Final  Report 
AI&DS  TR-1015-02 


May,  1983 


DISTRIBUTED  HYPOTHESIS  FORMATION 
IN  DISTRIBUTED  SENSOR  NETWORKS 


FINAL  REPORT 


Prepared for:

Dr. Barry M. Leiner
DARPA
1400 Wilson Boulevard
Arlington, VA 22209

Prepared by:

C.Y. Chong
S. Mori
E. Tse
R.P. Wishner


Approved  for  public  release;  distribution  unlimited. 


ADVANCED  INFORMATION  &  DECISION  SYSTEMS 


Mountain  View,  CA  94040 



ADVANCED  INFORMATION 
&  DECISION  SYSTEMS 


201  San  Antonio  Circle,  Suite  286 
Mountain  View,  CA  94040-1270 
(415)  941-3912 


Final  Report  May,  1983 

AI&DS  TR-1015-02 


DISTRIBUTED  HYPOTHESIS  FORMATION 
IN  DISTRIBUTED  SENSOR  NETWORKS 


FINAL  REPORT 


C.Y.  Chong 
S.  Mori 
E.  Tse 

R.P.  Wishner 


Sponsored  by 

Defense  Advanced  Research  Projects  Agency  (DoD) 
DARPA  Order  No.  4272 


Under  Contract  No.  MDA903-81-C-0333  issued  by  Department  of 
Army,  Defense  Supply  Service-Washington,  Washington,  D.C.  20310 

The views and conclusions contained in this document are those of
the authors and should not be interpreted as representing the
official policies, either expressed or implied, of the Defense
Advanced Research Projects Agency or the U.S. Government.


REPORT DOCUMENTATION PAGE

TITLE (and Subtitle):
Distributed Hypothesis Formation in Distributed Sensor Networks

AUTHOR(s):
C.Y. Chong, S. Mori, E. Tse, R.P. Wishner

TYPE OF REPORT & PERIOD COVERED:
Final Report, July 15, 1981 - May 30, 1983

CONTRACT OR GRANT NUMBER(s):
MDA903-81-C-0333

PERFORMING ORGANIZATION NAME AND ADDRESS:
Advanced Information & Decision Systems
201 San Antonio Cir., Suite 286
Mountain View, CA 94040-1270

CONTROLLING OFFICE NAME AND ADDRESS:
Defense Advanced Research Projects Agency
1400 Wilson Boulevard
Arlington, VA 22209

REPORT DATE:
May, 1983

NUMBER OF PAGES:
274

MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
Defense Supply Service - Washington
Room 1D 245, The Pentagon
Washington, D.C. 20310

SECURITY CLASS. (of this report):
UNCLASSIFIED

DISTRIBUTION STATEMENT (of this Report):
Approved for public release; distribution unlimited.

KEY WORDS:
Multitarget Tracking
Multitarget Surveillance System
Surveillance Correlation
Distributed Systems
Distributed Multitarget Tracking
Distributed Sensor Networks
Distributed Hypothesis Formation
Distributed Estimation Systems
Information Fusion

ABSTRACT:

This  report  presents  research  results  on  distributed  situation  assessment 
in  a  distributed  sensor  network  (DSN).  The  area  of  multitarget  tracking 
and  classification  has  been  chosen  to  investigate  issues  associated  with 
distributed  hypothesis  formation  and  evaluation.  A  general  theory  for 
Bayesian  multitarget  tracking  has  been  developed.  This  is  used  as  the 
basis  for  specifying  the  processing  architecture  at  each  node  in  the  DSN. 
Each node contains the Generalized Tracker/Classifier for processing of
local sensor data, an information fusion module to integrate processed
information  from  various  nodes,  and  an  information  distribution  module. 
The  problem  of  removing  redundant  information  in  a  general  distributed 
estimation  system  has  also  been  investigated.  Simulation  results  to 
study  various  issues  associated  with  distributed  situation  assessment 
are  presented. 




TABLE OF CONTENTS

1. INTRODUCTION AND SUMMARY
   1.1 PROJECT OBJECTIVES AND TECHNICAL APPROACH
   1.2 TECHNICAL ISSUES
   1.3 PROJECT ACCOMPLISHMENTS
   1.4 REPORT ORGANIZATION

2. SYSTEM DESCRIPTION AND NODAL ARCHITECTURE
   2.1 SYSTEM DESCRIPTION
   2.2 STRUCTURE OF EACH PROCESSING NODE
       2.2.1 Local Processing of the Sensor Data
       2.2.2 Information Fusion
       2.2.3 Information Distribution

3. GENERALIZED TRACKER/CLASSIFIER
   3.1 TARGET AND SENSOR MODELS
       3.1.1 Target Model
       3.1.2 Sensor Model
   3.2 HYPOTHESIS FORMATION
       3.2.1 Tracks and Hypotheses
       3.2.2 Tree Representation
   3.3 HYPOTHESIS EVALUATION
   3.4 FILTERING AND PARAMETER ESTIMATION
   3.5 HYPOTHESIS MANAGEMENT
       3.5.1 Pruning
       3.5.2 Combining
       3.5.3 Windowing
       3.5.4 Clustering

4. INFORMATION FUSION IN DISTRIBUTED GENERALIZED TRACKER/CLASSIFIER
   4.1 PROBLEM STATEMENT
       4.1.1 Models
       4.1.2 Tracks and Hypotheses
       4.1.3 Information Fusion Problem
   4.2 HYPOTHESIS FORMATION
       4.2.1 Example
       4.2.2 Hypothesis Formation Procedure
   4.3 HYPOTHESES EVALUATION
       4.3.1 Static Target Models
       4.3.2 Dynamic Target Models
   4.4 HYPOTHESIS MANAGEMENT

5. INFORMATION DISTRIBUTION AND PROBLEMS IN DISTRIBUTED ESTIMATION
   5.1 WHAT TO DISTRIBUTE
   5.2 ISSUES RELATED TO DISTRIBUTED ESTIMATION NETWORKS
       5.2.1 Basic Results
       5.2.2 Example

6. NUMERICAL EXAMPLES
   6.1 INTRODUCTION
   6.2 STATIONARY TARGET EXAMPLE
       6.2.1 Example Scenario
       6.2.2 Communication Schemes and Performance Measures
       6.2.3 Experimental Results
             6.2.3.1 Effect of Pruning Thresholds
             6.2.3.2 Expected Number of Targets
             6.2.3.3 Measurement Error
             6.2.3.4 False Alarm Rates
             6.2.3.5 Probability of Detection
       6.2.4 Summary of Results and Supplemental Discussions
   6.3 AN ALMOST CONSTANT VELOCITY TARGET EXAMPLE
       6.3.1 Example Scenario
       6.3.2 Communication Schemes and Performance Measures
       6.3.3 A Sample Run
             6.3.3.1 A Sample Run: Data
             6.3.3.2 Sample Run: Decentralized Scheme
             6.3.3.3 Sample Run: Centralized Scheme
             6.3.3.4 Sample Run: Distributed Scheme
       6.3.4 Monte Carlo Simulation Results
             6.3.4.1 Pruning Threshold
             6.3.4.2 Target Density
             6.3.4.3 Other Parameters
   6.4 SUMMARY OF NUMERICAL EXAMPLES

7. CONCLUSIONS

REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C

LIST OF FIGURES

Figure

2-1   Distributed Sensor Network
2-2   Structure of Processing Node
3-1   Generalized Tracker/Classifier
3-2   Hypothesis Tree
3-3   Pruning Strategies
3-4   N-Scan Hypothesis Combining
4-1   Structure of Information Fusion Problem
4-2   Local and Global Hypotheses - an Example
4-3   Computation of Track Likelihoods
5-1   Communication Structures
5-2   A Typical Simulation Run
6-1   Sensitivity to Pruning Threshold
6-2   Sensitivity to Expected Number of Targets
6-3   Sensitivity to Measurement Error
6-4   Sensitivity to False Alarm Rate
6-5   Sensitivity to Probability of Detection
6-6   Sensitivity to Error in Expected Number of Targets
6-7   Two-Dimensional Change of False and Missed Tracks
6-8   Node Layout
6-9   A Priori Distribution of Initial Velocity
6-10  A Sample of 10 Scans of Position Measurements
6-11  A Sample of 10 Scans of Position and Velocity Measurements
6-12  True Target Trajectories of the Sample
6-13  Measurement Data and True Target Trajectories



LIST OF TABLES

Table

4-1   Hypothesis Composition
5-1   Sensor Characteristics
5-2   Simulation Results
6-1   Baseline Parameters for Stationary Case
6-2   Performance Indices
6-3   Baseline Quantitative Comparison
6-4   Baseline Qualitative Comparison
6-5   Baseline Parameters for Almost Constant Velocity Case
6-6   Baseline Comparison for Dynamic Targets
6-7   Parameter Values for Sensitivity Study
6-8   Sensitivity to Pruning Threshold
6-9   Sensitivity to Target Density
6-10  Sensitivity to Position Measurement Error
6-11  Sensitivity to Velocity Measurement Error
6-12  Sensitivity to False Alarm Rate
6-13  Sensitivity to Probability of Detection


1.  INTRODUCTION  AND  SUMMARY 


This report describes research on the distributed processing of
sensor data for situation assessment at the processing nodes in a
distributed sensor network (DSN). This research has been performed at
Advanced Information & Decision Systems under the contract entitled
"Distributed Hypothesis Formation in Distributed Sensor Networks".

Distributed sensor networks have many positive attributes such as
improved performance, faster response time, more flexible communication
and less vulnerability as compared with centralized or hierarchical
systems. As a result they are attractive for many Department of Defense
applications. These DSNs can consist of a variety of sensor types
(e.g., microwave radar, SIGINT, IR, etc.) and are relevant to a variety
of defense systems (e.g., air defense, land warfare, space defense,
etc.). However, many research issues need to be addressed before such
DSN systems can be designed and built and can achieve their military
potential. In this project we have addressed and resolved some of these
issues. This final report summarizes the results of our investigation.


1.1  PROJECT  OBJECTIVES  AND  TECHNICAL  APPROACH 

The overall objective of this research project was to advance the
state of the art in distributed situation assessment in distributed
sensor networks. Specifically, we set out to accomplish the following
goals:

-  investigate techniques of hypothesis representation, formation and
evaluation, etc. in distributed sensor networks;

-  investigate various tradeoffs such as computation versus
communication, and the performance of centralized, decentralized and
distributed structures as a function of various parameters.

The  basic  system  model  consists  of  a  distributed  system  of  nodes 
which  are  connected  in  a  packet  switching  network.  Each  node  contains  a 
processor  and  one  or  more  sensors,  whose  coverage  may  overlap  those  of 
sensors  at  other  nodes.  The  input  information  at  each  node  consists  of: 

-  own  sensor  data 

-  messages  from  other  nodes 

-  contextual  information 

Our approach has been to understand the main technical issues
associated with a DSN by concentrating on the hypothesis representation,
formation, and evaluation processes. The tracking and classification of
multiple targets in a low signal-to-noise ratio and high clutter
environment was chosen as the particular application area. Our rationale
for concentrating on hypothesis representation, formation, and evaluation
was that in a dense target environment, with a low detection probability
and high false alarm rates, successful tracking and classification of
targets depends very much on forming the correct (data association)
hypotheses. Thus, multitarget tracking and classification provides a
rich problem domain to study distributed problems. In addition, such
problems have many applications in the defense area.

We have adopted an approach which is both analytical and heuristic.
A DSN is primarily an engineered information gathering and processing
system. Detailed mathematical models of sensors, targets, and the
environment are usually available. They are used to provide inputs for
generating optimal algorithms (in a precise mathematical sense) for
processing the data at the nodes. These algorithms are, however, only
applicable under ideal situations when there are no computation and/or
communication constraints. In a more realistic environment, where the
DSN is supposed to operate, these constraints cannot be ignored. They
are incorporated into the algorithms by the use of heuristics. In
particular, picking the right hypothesis can be regarded as a tree search
problem. Some artificial intelligence (AI) based ideas are used to
generate the heuristics for managing the search. Although AI provides
useful tools for handling distributed hypothesis formation problems, its
role in this project has been limited to hypothesis management. This
limitation was due to the fact that a mathematical foundation for
distributed multitarget tracking was still lacking and had to be
developed before the consideration of more complicated issues.

1.2  TECHNICAL  ISSUES 

There are many technical issues associated with distributed
systems. The following are the ones which are particularly relevant to
using a DSN for multitarget tracking and classification:

•  Local situation assessment

   -  how to represent and form hypotheses

   -  how to evaluate the goodness of a hypothesis

   -  how to keep the growing number of hypotheses within the
      computational constraints

•  Communication level

   -  what to communicate

      -  data versus hypotheses

      -  information to accompany hypotheses

   -  when to communicate

   -  how to integrate the incoming information into the local
      information

   -  how to process incoming information

   -  how to avoid redundant processing of the same information

These are some of the issues which need to be addressed before a DSN can
be designed. We have studied these issues in the project, both
analytically and experimentally through computer simulations.

1.3  PROJECT  ACCOMPLISHMENTS 

Little had previously been done on distributed multitarget tracking
and classification. Even in the centralized case, the existing results
are not good enough to provide a sound foundation for developing
distributed algorithms. To supply this foundation, we have developed a
theory for centralized multitarget tracking and classification. The
resulting algorithm, called the Generalized Tracker/Classifier (GTC),
provides a definitive treatment of hypothesis formation, evaluation, and
management in a centralized processing system. This theory addresses
most of the technical issues associated with local data processing for
situation assessment.


The centralized multitarget tracker and classifier has been used to
develop algorithms for processing the incoming information at a node and
integrating it with the local information. The resulting algorithm,
called the Distributed Generalized Tracker/Classifier, prescribes the
appropriate processing architecture at a node and represents a
systematic treatment of distributed multitarget tracking. Each processing
node consists of three modules: the Generalized Tracker/Classifier for
processing the local sensor data, an information fusion module to handle
the information from other nodes, and an information distribution module
for transmission of messages to other nodes. Many issues dealing with
information integration at a node have been addressed. One of the most
important has been the development of ways to avoid redundant use of the
same information at a node.

The algorithms have been coded and simulations have been performed
for various distributed scenarios to resolve the issues dealing with the
trade-off of communication versus computation. We have discovered that
there is a delicate trade-off. Because the number of hypotheses tends
to grow rapidly as the amount of data increases, having more data, as in
a centralized situation, is not necessarily better unless resources are
available to process the data. In general, the quality of the
information is more important than the amount of data. With a proper
distributed algorithm, performance similar to that of the centralized
scheme can be achieved.


1.4  REPORT  ORGANIZATION 


Results obtained earlier in the project have been documented in an
interim technical report and several papers. The interim technical
report [1] and the paper [2] contain a description of the centralized
Generalized Tracker/Classifier. A summary of the overall project has
been reported in [3], and [4] describes a framework for the general
distributed estimation problem.

The rest of the report is organized as follows. In Section 2 we
describe the system and the major components at each processing node in
the network. The three modules are the Generalized Tracker/Classifier
(GTC), the information fusion module and the information distribution
module. Section 3 describes the Generalized Tracker/Classifier (GTC),
which carries out the processing of local sensor data at each processing
node. The GTC also defines hypothesis representation, hypothesis
formation, and hypothesis evaluation for general multitarget tracking and
classification problems. The information fusion module in a distributed
GTC is described in Section 4; it contains submodules which are
analogous to those of the GTC except that they deal with processed
information from various nodes instead of sensor data. Section 5
considers information distribution and problems associated with general
estimation problems in a network. An example illustrates some pitfalls
which can result from careless information processing for a network. Two
numerical examples are described in Section 6 to illustrate the
tradeoffs of computation versus communication and their effects on system
performance. Specifically, the performance of centralized, decentralized
and distributed systems as a function of various parameters is
considered. Section 7 contains conclusions and suggestions for future
research.

Three appendices contain the details of the algorithms described in
the main body of the report. Appendix A is a paper on the theory behind
the centralized Generalized Tracker/Classifier. Appendix B presents the
theory on distributed estimation over a network. In particular, it
gives the distributed fusion formula for each node in a network.
Appendix C describes a theory for distributed multitarget tracking and
classification. This theory serves as the basis for the information
fusion algorithms used in each processing node.


2.  SYSTEM  DESCRIPTION  AND  NODAL  ARCHITECTURE 


2.1  SYSTEM  DESCRIPTION 

In this section we describe the structure of the system under
consideration. The distributed sensor network (DSN) consists of a
collection of processing nodes, each with one or more sensors, and a
communication network which connects the processing nodes. The structure
of the system is shown in Figure 2-1.


Each sensor generates measurements from the targets which are
within its field-of-view. The sensors are taken to be generic rather
than of a particular type. They have the following characteristics:

1.  The probability of detection of the targets by a sensor is less
than one and depends on the positions of the targets relative to
the sensor. If a target is not within the sensor's field-of-view,
it will not be detected. For certain types of sensors, such
as the MTI radar, only targets whose radial velocities with respect
to the sensor lie above a certain threshold are detected.

2.  False alarms are generated and correspond to ground clutter, etc.
The reports from the sensors may contain (discrete) feature
measurements as well as the usual (continuous) measurements such as
position and velocity.
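
The first characteristic can be illustrated with a minimal Python sketch of a state-dependent detection probability. Everything specific in it is an assumption made for illustration and does not come from the report: the function name `detection_probability`, the circular field-of-view of radius `max_range`, the linear fall-off with range, and the MTI-style `min_radial_speed` threshold.

```python
import math


def detection_probability(target_pos, target_vel, sensor_pos,
                          max_range=10.0, p_max=0.9, min_radial_speed=0.0):
    """Hypothetical detection probability for one target and one sensor.

    Assumptions (not from the report): 2-D geometry, a circular
    field-of-view of radius max_range, and a linear fall-off with range.
    """
    dx = target_pos[0] - sensor_pos[0]
    dy = target_pos[1] - sensor_pos[1]
    rng = math.hypot(dx, dy)

    # A target outside the field-of-view is never detected.
    if rng > max_range:
        return 0.0

    # Radial speed: magnitude of the velocity component along the line of sight.
    if rng > 0.0:
        radial_speed = abs(target_vel[0] * dx + target_vel[1] * dy) / rng
    else:
        radial_speed = 0.0

    # MTI-like behavior: slow radial motion is not detected.
    if radial_speed < min_radial_speed:
        return 0.0

    # Always below one, decreasing with distance from the sensor.
    return p_max * (1.0 - rng / max_range)
```

With `min_radial_speed` left at zero the sketch behaves like a generic range-limited sensor; a positive value mimics the radial-velocity threshold mentioned for MTI radar.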


Each processing node collects measurements from a set of sensors.
It is convenient to assume that the sensor sets for different processing
nodes are disjoint. The function of each processing node is to process
the local sensor data to form an assessment of the state of the world,
to distribute information to other nodes, and to combine the information
obtained from other nodes with the local information to update its
assessment about the state of the world. The processing nodes are thus
the main information processing units in the DSN.

Figure 2-1: Distributed Sensor Network (S = sensor, P = processor)

The communication network carries messages from one processing
node to the other processing nodes. The actual network may be a packet
radio network or another kind of network. Our study has not gone into
the details of the communication network but has considered it only as a
means of allowing certain nodes to share information.

2.2  STRUCTURE  OF  EACH  PROCESSING  NODE 

The purpose of each processing node is to integrate the data from
local sensors with information from other nodes to form an assessment of
the state of the world. There are three generic functions of the
processing node:

-  processing of the local sensor data

-  information fusion from other nodes

-  information distribution to other nodes

These three functions are implemented as three separate modules
within each processing node. The structure of each node in the system
is shown in Figure 2-2. The three modules are discussed briefly below
and in detail in Sections 3, 4 and 5.

Figure 2-2: Structure of Processing Node


2.2.1  Local  Processing  of  the  Sensor  Data 

This function is responsible for the local data processing before
any communication with the other nodes is carried out. Since the
objective of the system under consideration is the tracking and
classification of multiple targets, this function will be a multitarget
tracker. In our system, it is called the Generalized Tracker/Classifier
(GTC). The GTC forms multiple hypotheses, each consisting of a
collection of tracks to explain the origins of the measurements in each
data set. These hypotheses are then evaluated with respect to their
probabilities of being true. To stay within the computational
constraints of each node, the hypotheses are pruned, combined,
clustered, etc. The result of this processing is a set of hypotheses and
their probabilities, a collection of tracks corresponding to possible
targets and the state distributions of these tracks. These quantities
together constitute the information state for multitarget tracking.
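
As a rough illustration of how such an information state might be held in memory and kept small, the Python sketch below stores each hypothesis as a set of track identifiers with a probability and prunes by a probability threshold. The class name `Hypothesis`, the function `prune`, and the rule "discard below a threshold, then renormalize" are assumptions made for this example; the pruning strategies actually used are those of Section 3.5.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One data-association hypothesis (illustrative structure only)."""
    tracks: frozenset    # identifiers of the tracks this hypothesis contains
    probability: float   # probability that this hypothesis is true


def prune(hypotheses, threshold):
    """Drop hypotheses whose probability is below threshold, then renormalize.

    Assumed rule for illustration; if nothing survives, the single most
    probable hypothesis is kept so the result is never empty.
    """
    kept = [h for h in hypotheses if h.probability >= threshold]
    if not kept:
        kept = [max(hypotheses, key=lambda h: h.probability)]
    total = sum(h.probability for h in kept)
    return [Hypothesis(h.tracks, h.probability / total) for h in kept]


# Example: three competing explanations of the same data sets.
hypotheses = [
    Hypothesis(frozenset({"T1", "T2"}), 0.60),
    Hypothesis(frozenset({"T1"}), 0.35),
    Hypothesis(frozenset(), 0.05),
]
survivors = prune(hypotheses, threshold=0.10)
```

Renormalizing after pruning keeps the surviving probabilities summing to one, at the cost of discarding the probability mass of the removed hypotheses.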

2.2.2  Information  Fusion 

This module combines the local information with information
obtained from the other nodes to obtain a new assessment. The
information from the local node consists of the information state
described above. The information from other nodes is similar.
Information fusion then consists of the following steps:

1.  Hypothesis Formation - Given a set of hypotheses from other nodes,
this submodule generates new global hypotheses. Tracks from the
hypotheses of different nodes are associated in all possible ways,
whether they correspond to the same or different targets.

2.  Hypothesis Evaluation - Each of the hypotheses formed above is then
evaluated with respect to its probability of being true. The
statistics of the tracks from different hypotheses are used in this
evaluation. For example, if two tracks are widely apart in their
position or velocity distributions, they are more likely to have
come from different targets than the same target.

3.  Hypothesis Management - This is again needed to make computation
feasible within the available resources.
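
The hypothesis formation step can be sketched in Python for the simplest case of one local hypothesis from each of two nodes. The function name `associations` and the representation of an association as a list of (track, track) pairs are assumptions for this example; unpaired tracks are interpreted as belonging to different targets. The full procedure is the one given in Section 4.2.

```python
from itertools import combinations, permutations


def associations(tracks_a, tracks_b):
    """Yield every way of pairing tracks of node A with tracks of node B.

    Each result is a list of (a, b) pairs claimed to be the same target.
    A track that appears in no pair is treated as a separate target.
    Illustrative sketch only, for a single pair of local hypotheses.
    """
    max_pairs = min(len(tracks_a), len(tracks_b))
    for k in range(max_pairs + 1):
        # Choose which k tracks of A are matched ...
        for subset_a in combinations(tracks_a, k):
            # ... and which k tracks of B they are matched to, in order.
            for ordered_b in permutations(tracks_b, k):
                yield list(zip(subset_a, ordered_b))


# Two tracks at each node give 1 + 4 + 2 = 7 candidate global hypotheses.
candidates = list(associations(["a1", "a2"], ["b1", "b2"]))
```

The count grows quickly with the number of tracks, which is why the evaluation and management steps that follow are needed to keep the set of global hypotheses tractable.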


2.2.3  Information  Distribution 

This module decides what information is to be transmitted, who gets
the information, and when it should be communicated. It thus specifies
the information available to each node at any time, i.e., the
information structure of the system.


3.  GENERALIZED  TRACKER/CLASSIFIER  (GTC) 




In this section, we describe the Generalized Tracker/Classifier
(GTC), which is the module for processing the local sensor data within
each node in the DSN. The GTC structure is shown in Figure 3-1, and the
theory upon which it is based has been described in more detail in an
earlier report [1] and in Appendix A. A summary can also be found in
[2]. The GTC represents the most complete theory thus far available for
Bayesian multitarget tracking and classification. It can be shown, as
in Appendix A, that many existing algorithms are special suboptimal
cases of the GTC when the appropriate approximations are made. In
addition, the GTC can handle complex situations, such as targets moving
as a group and state dependent detection probabilities, which are not
considered in the existing algorithms. We shall first describe the
models used in the GTC and then the modules in the actual tracker.

3.1  TARGET  AND  SENSOR  MODELS 

This  section  describes  the  target  and  sensor  models.  These  models 
provide  the  mathematical  foundation  for  the  modules  of  the  tracker. 

Figure 3-1: Generalized Tracker/Classifier


3.1.1  Target  Model 


A novel feature of the target model used in our approach is that
the targets are modeled together as one target system state. More
specifically, the target system state at any time $t$ is $(X(t), N_T(t))$,
where $N_T(t)$ is the number of targets and $X(t)$ is the composite state
of all targets. Information about the total number of targets is very
useful in multitarget tracking. For example, if it is known that the
maximum number of targets is 10, any sensor report containing more than
10 measurements would most likely (unless there are split measurements)
contain some false alarms. In the current approach, knowledge of the
number of targets is viewed as an integral part of the target model.
$N_T$ can have an arbitrary probabilistic description, but a particularly
useful assumption is that $N_T$ is constant in time and has a Poisson
distribution with mean $\nu_0$. Thus,

$$\mathrm{Prob}\{N_T = n\} = \frac{\nu_0^n}{n!} \exp(-\nu_0) \tag{3.1}$$
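
The Poisson prior on the number of targets is easy to evaluate numerically. The short Python sketch below computes the probability mass function and the probability that the number of targets exceeds a given count; the function names `poisson_pmf` and `prob_more_than` and the mean value used in the example are illustrative choices, not values from the report.

```python
import math


def poisson_pmf(n, nu0):
    """Probability that the number of targets equals n, for Poisson mean nu0."""
    return nu0 ** n / math.factorial(n) * math.exp(-nu0)


def prob_more_than(m, nu0):
    """Prior probability that there are more than m targets."""
    return 1.0 - sum(poisson_pmf(n, nu0) for n in range(m + 1))


# With an assumed mean of 3 targets, more than 10 targets is a rare event,
# so a report with more than 10 measurements very likely contains false alarms.
tail = prob_more_than(10, 3.0)
```

Under such a prior, a large measurement count is explained mostly by false alarms rather than by additional targets, which is the intuition behind the example in the text.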


Given $N_T(t) = n$, the composite state for the $n$ targets in general
consists of two parts: a part corresponding to the common target state
such as the group position, velocity or type if we are dealing with a
group, and a part corresponding to the individual target states. This
structure allows us to handle complex target structures such as targets
moving as a group.

In many applications, the group part is absent and the individual
target models are independent and identically distributed random
processes. The state of the $i$-th target, $x_i$, is then characterized by
the initial distribution/density

$$\mathrm{Prob}\{x_i(t_0) \in dx\} = q_0(x)\,\mu(dx) \tag{3.2}$$

and transition probability density

$$\mathrm{Prob}\{x_i(t+\Delta t) \in dx \mid x_i(t) = x'\} = q_{\Delta t}(x \mid x')\,\mu(dx). \tag{3.3}$$

In general, $x_i(t)$ is an element of a hybrid set $X$, where the
continuous part corresponds to position, velocity, etc., and the discrete
part corresponds to the type of targets, sudden structural changes in
dynamics (maneuvering targets), changes in operational modes, etc. $\mu$
is the hybrid measure on $X$, i.e., the direct product of the usual
Lebesgue measure for the continuous state space and the counting measure
for the discrete state set. The usual linear continuous models assumed in
multitarget tracking are then special cases of this model. For the rest
of this report, we consider the case where no group information is
available and the target models are independent and identically
distributed random processes.
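
To make the hybrid state concrete, the following Python sketch represents a single target by a continuous part (position and velocity in one dimension) and a discrete part (a type label), draws an initial state, and applies one transition. The class `TargetState`, the particular initial distribution, and the nearly constant velocity transition with a random acceleration are all assumptions chosen for illustration; they stand in for the general initial and transition densities of the text and are not the models used in the report.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetState:
    """Hybrid state of one target (illustrative layout)."""
    position: float     # continuous part
    velocity: float     # continuous part
    target_type: str    # discrete part


def sample_initial(rng, types=("A", "B")):
    """Draw an initial state from an assumed initial distribution."""
    return TargetState(
        position=rng.uniform(0.0, 100.0),
        velocity=rng.gauss(0.0, 1.0),
        target_type=rng.choice(types),
    )


def transition(state, dt, rng, accel_sigma=0.1):
    """Advance the state by dt under an assumed nearly constant velocity model.

    A random acceleration is held constant over the step; the discrete
    type is left unchanged.
    """
    accel = rng.gauss(0.0, accel_sigma)
    position = state.position + state.velocity * dt + 0.5 * accel * dt * dt
    velocity = state.velocity + accel * dt
    return TargetState(position, velocity, state.target_type)


def sample_targets(n, rng):
    """Independent, identically distributed initial states for n targets."""
    return [sample_initial(rng) for _ in range(n)]
```

A seeded generator such as `random.Random(0)` can be passed as `rng` to make a simulated scenario reproducible.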

3.1.2  Sensor  Model 


At a scan or observation time $t$, a sensor $s$ generates a data set
$((y_j)_{j=1}^{N_M}, N_M, t, s)$, where $N_M$ is the number of measurements
in the data set and $y_j$ is the $j$-th measurement in the set. Each $y_j$
takes a value in the measurement value space $V_s$ for the sensor $s$.
$V_s$ is, in general, a hybrid set with measure $\mu_s$. The continuous
part of $V_s$ corresponds to an observed position or velocity, etc., while
the discrete part corresponds to observed features.

Given a target system state $((x_i(t))_{i=1}^{N_T}, N_T)$, the generation
of a data set is characterized by the following four steps:


•  Target Detection: The set of targets detected by sensor $s$ at time $t$
is a random subset $I_D(t,s)$ of the target index set $\{1,\ldots,N_T\}$. It
can be characterized by the detection function $F_D(t,s)$, which is the
random indicator function of $I_D(t,s)$, i.e., $F_D(t,s)(i) = 1$ if
$i \in I_D(t,s)$ (target $i$ is detected) and $F_D(t,s)(i) = 0$ otherwise
(target $i$ is not detected).

We assume that

$$\mathrm{Prob}\{F_D(t,s)(i) = 1 \mid x_i(t)\} = p_D(x_i(t) \mid t,s) \qquad (3.4)$$

with a common detection probability function $p_D(\cdot \mid t,s)$. Thus the
detection of a target $i$ is a conditionally independent event determined
only by the target state $x_i(t)$, the sensor $s$ and the time $t$.
The number of targets detected by sensor $s$ at time $t$ is given by

$$N_D(t,s) = \#(I_D(t,s)) \qquad (3.5)$$

where $\#$ denotes the number of elements in a set.

•  Number of False Alarms: The number of false alarms, $N_{FA}(t,s)$,
generated by sensor $s$ at time $t$ depends only on the time and the sensor
and is independent of the target state or any other sensor data.
Specifically, its probability is given by

$$\mathrm{Prob}\{N_{FA}(t,s)\} = p_{N_{FA}}(N_{FA}(t,s) \mid t,s). \qquad (3.6)$$

The total number of measurements in the data set is then given by

$$N_M(t,s) = N_D(t,s) + N_{FA}(t,s). \qquad (3.7)$$

Let $J_M(t,s) = \{1,\ldots,N_M(t,s)\}$ be the set of measurement indices
from sensor $s$ at time $t$.

•  Measurement Random Assignment: Given the set $I_D(t,s)$ of detected
targets, the measurement indices of the detected targets are modeled
by the assignment function $A(t,s)$. This is a random function
defined on $I_D(t,s)$ and taking values in $J_M(t,s)$ such that for each
$i \in I_D(t,s)$ and $j \in J_M(t,s)$,

$$j = A(t,s)(i) \qquad (3.8)$$

means that the $j$-th measurement originates from target $i$. A realization
of $A(t,s)$ is a one-to-one mapping from $I_D(t,s)$ to $J_M(t,s)$. We
assume any particular order of measurements in the data set does not
contain any information about the targets, i.e., it is completely
random. Thus, given $I_D(t,s)$ and $J_M(t,s)$, any realization of $A(t,s)$
is equally likely and its probability is given by

$$\mathrm{Prob}\{A(t,s) \mid N_M(t,s), I_D(t,s), X(t), N_T\} = \frac{(N_M(t,s) - N_D(t,s))!}{N_M(t,s)!} = \frac{N_{FA}(t,s)!}{N_M(t,s)!}. \qquad (3.9)$$


•  Measurement Values: Given the set of detected target indices
$I_D(t,s)$, the set of measurement indices $J_M(t,s)$ and the random
assignment function $A(t,s)$, the measurement (value) $y_{A(t,s)(i)}$
originating from a detected target $i$ is conditionally independent of
the other measurement values and depends only on the target state
$x_i(t)$, the sensor $s$ and the time $t$, i.e.,

$$\mathrm{Prob}\{y_{A(t,s)(i)} \in dy \mid A(t,s), X(t), N_T\} = p_M(y_{A(t,s)(i)} \mid x_i(t), t, s)\,\mu_s(dy) \qquad (3.10)$$

where $p_M(\cdot \mid x_i(t), t, s)$ is the common probability density function.

Given the set $J_{FA}(t,s)$ of false alarms, each measurement value
$y_j$ for the $j$-th false alarm in $J_{FA}(t,s)$ is completely independent
(of each other and of the targets) and has a common probability density
$p_{FA}(\cdot \mid t,s)$.

The main feature of this sensor model is its generality. The
number of measurements, as well as the actual measurement values
themselves, are considered an integral part of the sensor report.
Furthermore, $p_D(x \mid t,s)$, the detection probability of a target, not only
depends on the time $t$ and the sensor $s$, but is also a function of the
state of the target. This state dependence is particularly useful when
there is masking of the targets or when sensor detection depends on the
radial velocity of the target, as in an MTI radar.

3.2  HYPOTHESIS  FORMATION 

Hypothesis formation is the first step in the GTC operation. It
forms the feasible associations of data from different times and
different sensors.

3.2.1  Tracks and Hypotheses

Since the origins of the measurements in each sensor report or data
set are uncertain, one of the crucial steps in multitarget tracking is
the formation of the data-to-data association hypothesis, or simply the
hypothesis. Each hypothesis corresponds to a possible explanation of
the origins of the measurements. These hypotheses are then
evaluated with respect to their probabilities of being true
(Section 3.3).

We can index each data set by $k = (t,s)$, the time $t$ when it is
generated and the sensor $s$ reporting the data. Each data set
$((y_j)_{j=1}^{N_M}, N_M, t, s)$ can then be represented by $z(k)$. Let $K$ be
the collection of all the data set indices, called the data set index set.
The order in which the data sets arrive defines a natural lexicographic
order $<$ on $K$.

Consider the cumulative measurement index set at time $k$ defined as

$$J_M^{(k)} = \bigcup_{k' \le k} J_M(k') \times \{k'\}.$$

Each element $(j,k) = (j,t,s)$ in $J_M^{(k)}$ represents the $j$-th measurement
in the data set from sensor $s$ at time $t$. Our objective is to explain the
origin of each of these elements.


According to the sensor model, the uncertainty in the measurements
for the detected targets is due to the random assignment $A(k)$. For each
target $i$, the set of measurement indices originating from $i$ is given by

$$T(i) = \{(A(k)(i), k) \mid k \in K,\ i \in I_D(k)\}. \qquad (3.11)$$


Since $I_D(k)$ and $A(k)$ are random, each $T(i)$ is a random set. Let

$$\Lambda = \{T(i) \mid i \in \bigcup_{k \in K} I_D(k)\} = \{T(i) \mid i \in \{1,\ldots,N_T\},\ T(i) \neq \emptyset\}. \qquad (3.12)$$

Then $\Lambda$ is a random set identifying the measurements for all
the detected targets. At any $k$, the restriction of $\Lambda$ to $k$ is
defined as

$$\Lambda|_k = \{T \cap J_M^{(k)} \mid T \in \Lambda,\ T \cap J_M^{(k)} \neq \emptyset\}, \qquad (3.13)$$

which identifies the measurement indices for each of the targets
detected up to $k$.

For each target $i$, $T(i)$, its measurement indices, has realized
values which are subsets of $J_M^{(k)}$. Each of these subsets is called a
track at $k$ and denoted by $\tau$. Since we assume that a target can
generate at most one measurement in a data set, the possible tracks
are those containing at most one measurement index from each data set.
Let the set of possible tracks at $k$ be $\mathcal{T}(k)$.

A data-to-data association hypothesis $\lambda$ (henceforth called a
hypothesis) at $k$ is a (possibly empty) collection of non-empty tracks.
A hypothesis $\lambda$ is thus a particular realization of the random set
$\Lambda|_k$. Again, since we assume no merged measurements, no possible
hypothesis can contain intersecting tracks. Let $H(k)$ be the set of all
possible hypotheses at $k$.

Thus, $H(k)$ is the set of all possible realizations of the random
set $\Lambda|_k$, i.e., $H(k)$ consists of all possible explanations of the
origins of the measurements in the data sets up to $k$. An event
$\{\Lambda|_k = \lambda\}$ means that:

1.  $\#(\lambda)$ targets have been detected and included in at least one of
the data sets prior to and including $k$;

2.  each track $\tau$ in $\lambda$ corresponds uniquely to a target detected
prior to or at $k$;

3.  for any $k' \le k$, $(j',k') \in \tau$ implies that the $j'$-th measurement
in the data set $k'$ originates from the target identified by $\tau$;

4.  if the intersection of $\tau$ and $J_M(k') \times \{k'\}$ is empty, then the
target is falsely dismissed in $k'$;

5.  any measurement indices not in $\bigcup \lambda$ are false alarms.

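Similar to (3.13), for each $k$ in $K$ and each $k' \le k$, we can define
the restriction of $\tau$ to $J_M^{(k')}$ (or simply to $k'$) as

$$\tau' = \tau|_{k'} = \tau \cap J_M^{(k')}. \qquad (3.14)$$

The restriction of $\lambda$ to $k'$ can be defined similarly as

$$\lambda' = \lambda|_{k'} = \{\tau|_{k'} \mid \tau \in \lambda\} \setminus \{\emptyset\}. \qquad (3.15)$$

In the above, $\tau'$ and $\lambda'$ are called the predecessors of $\tau$ and
$\lambda$, respectively.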


3.2.2  Tree  Representation 


Although the hypotheses are defined as collections of tracks which
are in turn collections of measurement indices, an equivalent
representation is by means of a tree. In this representation, each level
in the tree corresponds to a measurement index and a node corresponds to a
target. Hypothesis formation given a new sensor report then reduces to
the expansion of the tree, and a branch of the tree represents a
particular data-to-data association hypothesis. The concept of the
predecessor of a hypothesis is obvious from this representation.

Figure 3-2 shows a hypothesis tree for two data sets with two
measurements in each. The tracks associated with each hypothesis are also
given. For example, hypothesis 24 associates two of the measurements, one
from each data set, with the same target (track 5), and a third
measurement with a different target (track 2). It thus hypothesizes the
remaining measurement to be a false alarm. Note that from two data sets
with two measurements each, we have eight possible tracks and a total of
34 possible hypotheses.
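
The counts quoted above can be checked by brute-force enumeration. The following Python sketch is not the tree-expansion procedure of this report; it simply lists every non-empty track with at most one measurement index per data set and every collection of pairwise disjoint tracks. The function names and the representation of a track as a `frozenset` of `(j, k)` pairs are assumptions made for the example.

```python
from itertools import combinations, product

def possible_tracks(sizes):
    """All non-empty tracks with at most one measurement index per data set.

    sizes[k] is the number of measurements in data set k; a track is a
    frozenset of (j, k) pairs.
    """
    choices = [[None] + list(range(n)) for n in sizes]
    tracks = []
    for pick in product(*choices):
        track = frozenset((j, k) for k, j in enumerate(pick) if j is not None)
        if track:
            tracks.append(track)
    return tracks

def possible_hypotheses(tracks):
    """All collections (including the empty one) of pairwise disjoint tracks."""
    hypotheses = []
    for r in range(len(tracks) + 1):
        for combo in combinations(tracks, r):
            if all(a.isdisjoint(b) for a, b in combinations(combo, 2)):
                hypotheses.append(frozenset(combo))
    return hypotheses

tracks = possible_tracks([2, 2])
hypotheses = possible_hypotheses(tracks)
print(len(tracks), len(hypotheses))  # prints "8 34"
```

Exhaustive enumeration of this kind grows very quickly with the number of data sets, which is the motivation for the hypothesis management techniques of Section 3.5.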

3.3  HYPOTHESIS  EVALUATION 

Many data-to-data association hypotheses are generated by the
Hypothesis Formation Module. In order to rank these hypotheses, the
Hypothesis Evaluation Module evaluates the probability of each
hypothesis. This evaluation is based on the target models, sensor
models and the measurement values. For general target models, the
evaluation formula can be found in Appendix A. The following describes
the evaluation scheme for independent and identically distributed target
models, which are the emphasis in this report.

Let $Z^{(k)}$ be the cumulative data set at $k$, i.e.,
$Z^{(k)} = (z(k') \mid k' \le k)$, and $\lambda$ be a hypothesis defined on
$J_M^{(k)}$. We would like to evaluate
$\mathrm{Prob}\{\Lambda|_k = \lambda \mid Z^{(k)}\} = P(\lambda \mid Z^{(k)})$. Let $k'$ be the
immediate predecessor of $k$, the latest data index, and assume
$\mathrm{Prob}\{\lambda|_{k'} \mid Z^{(k')}\}$ is known. Suppose $z(k) = (y, m, k)$. Then each
hypothesis can be evaluated as

$$P(\lambda \mid Z^{(k)}) = C^{-1}\, P(\lambda|_{k'} \mid Z^{(k')})\, L_{FA}(k,\lambda) \prod_{\tau \in \lambda} (\nu(k'))^{\epsilon(\tau)}\, L_k(Y(\tau,k), \tau) \qquad (3.16)$$

where $C$ is a normalization constant and the $L$'s are likelihood
functions. The False Alarm Likelihood Function is:

$$L_{FA}(k,\lambda) = n_{FA}(\lambda \mid k)!\; p_{N_{FA}}(n_{FA}(\lambda \mid k)) \prod_{j \in J_{FA}(\lambda, m \mid k)} p_{FA}(y_j \mid k) \qquad (3.17)$$

where $n_{FA}(\lambda \mid k)$ is the number of false alarms in $z(k)$ according to
$\lambda$, and $J_{FA}(\lambda, m \mid k)$ is the set of false alarms in $z(k)$
according to $\lambda$.

The Track-Measurement Likelihood Function is:

$$(\nu(k'))^{\epsilon(\tau)}\, L_k(Y(\tau,k), \tau) \qquad (3.18)$$

where

$\nu(k')$ = the expected number of targets which are not detected
up to or at $k'$;

$\epsilon(\tau)$ = 1 if $\tau$ is a new track in $z(k)$; 0 otherwise;

$Y(\tau,k)$ = the measurement for track $\tau$ in the data set $k$ (if track
$\tau$ is missed in $k$, $Y(\tau,k) = \theta$);

$$L_k(y,\tau) = \int_X g_k(y \mid x)\, p_\tau^{(k)}(x)\, \mu(dx); \qquad (3.19)$$

$p_\tau^{(k)}(x) = p(x(t) \mid Z^{(k')}, \tau)$ is the density of $x(t)$ given
$Z^{(k')}$ and track $\tau$; and

$$g_k(y \mid x) = \begin{cases} p_M(y \mid x, k)\, p_D(x \mid k) & \text{if } y \neq \theta \\ 1 - p_D(x \mid k) & \text{if } y = \theta. \end{cases} \qquad (3.20)$$

There are thus three types of track-measurement likelihood
functions to be evaluated:

1.  the likelihood $L_k(Y(\tau,k),\tau)$ of measurement $Y(\tau,k) \neq \theta$
originating from a previously detected target ($\tau|_{k'} \neq \emptyset$);

2.  the likelihood $L_k(Y(\tau,k),\tau)$ of a previously detected target
($\tau|_{k'} \neq \emptyset$) being undetected ($Y(\tau,k) = \theta$); and

3.  the likelihood $\nu(k') L_k(Y(\tau,k),\tau)$ of a measurement
$Y(\tau,k) \neq \theta$ originating from a newly detected target
($\tau|_{k'} = \emptyset$).

In addition, $L_k(\theta, \emptyset)$, the likelihood of an undetected target
remaining undetected, is also used. The evaluation formula is very general
and reduces to the standard ones used in multitarget tracking when the
appropriate approximations are made.
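
To make the roles of (3.16), (3.17), (3.19) and (3.20) concrete, the following Python sketch evaluates three candidate hypotheses for one new data set. It is a toy under stated assumptions, not the evaluation module of this project: the state space is finite (so the integral in (3.19) becomes a sum), all numerical values are invented, the three candidates share one predecessor hypothesis, the factor $L_k(\theta, \emptyset)$ is omitted, and every function and variable name is hypothetical.

```python
import math

MISS = None  # stands for theta, the "no measurement" value

def g(y, x, p_d, p_m):
    """g_k(y|x) of (3.20) on a finite state space."""
    if y is MISS:
        return 1.0 - p_d[x]
    return p_m[x][y] * p_d[x]

def track_likelihood(y, density, p_d, p_m):
    """L_k(y, tau) of (3.19); the integral over X becomes a sum."""
    return sum(g(y, x, p_d, p_m) * px for x, px in density.items())

def false_alarm_likelihood(fa_values, p_nfa, p_fa):
    """L_FA of (3.17) for the measurements a hypothesis calls false alarms."""
    n = len(fa_values)
    return math.factorial(n) * p_nfa[n] * math.prod(p_fa[y] for y in fa_values)

def evaluate(candidates):
    """Posterior probabilities of (3.16), normalized over the candidates.

    Each candidate is (prior, l_fa, terms); each term is (is_new, nu, l_k).
    """
    weights = []
    for prior, l_fa, terms in candidates:
        w = prior * l_fa
        for is_new, nu, l_k in terms:
            w *= (nu if is_new else 1.0) * l_k
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

# Invented two-state, two-feature example: one existing track and a data
# set containing the single measurement "a".
p_d = {"near": 0.9, "far": 0.5}
p_m = {"near": {"a": 0.9, "b": 0.1}, "far": {"a": 0.2, "b": 0.8}}
p_nfa = {0: 0.7, 1: 0.3}
p_fa = {"a": 0.5, "b": 0.5}
old_track = {"near": 0.8, "far": 0.2}    # predicted density of the track
undetected = {"near": 0.5, "far": 0.5}   # density of a not-yet-detected target
nu = 0.3                                 # expected number of undetected targets

hit = track_likelihood("a", old_track, p_d, p_m)
miss = track_likelihood(MISS, old_track, p_d, p_m)
new = track_likelihood("a", undetected, p_d, p_m)
candidates = [
    # "a" continues the existing track, no false alarms
    (1.0, false_alarm_likelihood([], p_nfa, p_fa), [(False, nu, hit)]),
    # "a" is a false alarm, the existing track is missed
    (1.0, false_alarm_likelihood(["a"], p_nfa, p_fa), [(False, nu, miss)]),
    # "a" starts a new track, the existing track is missed
    (1.0, false_alarm_likelihood([], p_nfa, p_fa),
     [(True, nu, new), (False, nu, miss)]),
]
posterior = evaluate(candidates)
```

The three `terms` entries correspond to the three types of track-measurement likelihood listed above: a continued track, a missed track, and a newly detected target weighted by $\nu(k')$.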


3.5  HYPOTHESIS  MANAGEMENT 


The hypothesis management module controls the growth in the number
of hypotheses and makes the implementation of the GTC feasible. It is
model-independent in the sense that the techniques involved are
applicable to a wide class of scenarios. The user, however, should select
certain parameters to conform with the computational and memory
requirements or to reflect his knowledge about the complexity of the
situation.

No general theory on hypothesis management techniques exists at the
present moment. The purpose of this section is to summarize some
existing techniques and describe any modifications to such techniques that
have been adopted in our work. We divide the hypothesis management
techniques into the following four classes.

1.  Pruning  ....  cutting  branches 

2.  Combining  ....  binding  branches  together 

3.  Windowing  ....  data  validation 

4.  Clustering  ....  data  partition 


In  the  following,  we  discuss  the  techniques  according  to  the  above 
classification.  Although  some  of  the  techniques  described  below  may 
apply  to  the  general  target  models,  for  the  most  part  we  restrict  our 
attention  to  the  i.i.d.  target  case. 


3.5.1  Pruning 


Pruning  techniques  can  be  further  classified  into  (1)  thresholding, 
(2)  breadth  control,  and  (3)  adaptive  pruning  techniques. 

The basic philosophy behind thresholding is to cut (or remove) the
"insignificant" hypotheses, which in turn tend to produce more
insignificant hypotheses. In [5], it is proposed to cut any hypothesis with
posterior probability less than a fixed predetermined threshold. In the
i.i.d. target case, thresholding may be performed at the track level
using the track likelihood functions. One of the disadvantages of this
fixed thresholding is that it is performed without considering the
available computational resources or the external condition (e.g., clear
versus confusing, etc.). For example, given the same computational
resources, one should be able to keep more hypotheses for a small amount
of data than for a large amount of data. This adaptivity, however, is
not present in the fixed thresholding scheme.

This consideration leads to the second subclass of breadth control
techniques in which a fixed number, say M, of the best hypotheses are
chosen and propagated forward. This technique is proposed by Keverian
in [6]. Choosing a fixed breadth M makes sense when we regard the
number of hypotheses kept as a measure of the computational and memory
requirements. However, fixed breadth control may deviate from its
original rationale quickly when some form of clustering is used since the
resources cannot be efficiently allocated among the clusters. Also, the
breadth control or fixed breadth method requires a sorting algorithm,
which requires additional effort (although this issue may not be
important since many good sorting algorithms do exist). When breadth
control is used to its limit, only one (best) hypothesis is
selected and propagated forward. In [5], this mode of pruning is
called a zero-scan algorithm.

Although fixed-threshold pruning may be viewed as adaptive
breadth-control pruning, and vice versa, these techniques do not really
adapt to the complexity of the situation. The third subclass of pruning
techniques, introduced in this project, is called adaptive pruning. In
this strategy, the hypotheses are first sorted in descending order of
their posterior probabilities. Then, when the cumulative sum of the
probabilities exceeds a given threshold, the remaining low probability
hypotheses are pruned. This method may be called
adaptive-threshold/adaptive-breadth pruning since it adjusts both the
absolute threshold and the breadth according to the complexity of the
external condition, i.e., the more complex the situation is, the more low
probability hypotheses are retained. This adaptive pruning technique makes
more sense than other pruning methods when clustering is used and may be
viewed as a way of automatically allocating computational and memory
resources among the clusters. However, it still suffers from the same
drawback of any fixed thresholding scheme in that the actual (absolute)
computational and memory resources cannot be predicted. Furthermore,
some form of sorting is still needed.

From a theoretical point of view, the posterior probabilities of
hypotheses may be considered as a discrete distribution. Hypothesis
pruning may then be viewed as picking the approximation techniques for
the distribution. Figure 3-3 displays the approximation involved in the
three schemes. Thus the theoretical issues in hypothesis pruning are
what a good approximation should be and how it influences the future
evaluation of multitarget tracking. Although some theories on the
approximation of probability distributions (e.g., Sorenson and Alspach
[7]) may give us some insight, we believe that hypothesis pruning is
still an open research area.
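
The three pruning schemes discussed in this subsection can be compared with a minimal Python sketch. The representation of a hypothesis as a `(label, probability)` pair, the function names, the example probabilities and the renormalization helper are assumptions introduced for illustration; the text above does not prescribe how the surviving probabilities are rescaled.

```python
def prune_threshold(hyps, threshold):
    """Fixed thresholding: drop hypotheses with probability below threshold."""
    return [(h, p) for h, p in hyps if p >= threshold]

def prune_breadth(hyps, m):
    """Breadth control: keep the m most probable hypotheses."""
    return sorted(hyps, key=lambda hp: hp[1], reverse=True)[:m]

def prune_adaptive(hyps, mass):
    """Adaptive pruning: keep the most probable hypotheses until their
    cumulative probability exceeds mass, then drop the rest."""
    kept, total = [], 0.0
    for h, p in sorted(hyps, key=lambda hp: hp[1], reverse=True):
        kept.append((h, p))
        total += p
        if total > mass:
            break
    return kept

def renormalize(hyps):
    """Rescale the surviving probabilities to sum to one (an assumption)."""
    total = sum(p for _, p in hyps)
    return [(h, p / total) for h, p in hyps]

# A "clear" situation and a "confusing" one (invented numbers).
clear = [("a", 0.90), ("b", 0.06), ("c", 0.04)]
confusing = [("a", 0.30), ("b", 0.25), ("c", 0.25), ("d", 0.20)]

kept_clear = prune_adaptive(clear, 0.95)          # keeps 2 hypotheses
kept_confusing = prune_adaptive(confusing, 0.95)  # keeps all 4 hypotheses
zero_scan = prune_breadth(clear, 1)               # breadth 1: best hypothesis only
```

With the same cumulative-probability setting, the adaptive scheme keeps two hypotheses in the clear case and four in the confusing case, which is the behavior described above; the number kept, and hence the computational load, is not known in advance.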


3.5.2  Combining 

The existing combining techniques can be divided into two
subclasses: (1) distribution-oriented techniques, and (2)
measurement-index-oriented techniques.

The  philosophy  behind  the  first  subclass  is  to  combine  two  similar 
hypotheses,  where  similarity  is  interpreted  in  a  certain  way.  According 
to  Reid  [5],  two  hypotheses  are  similar  if  they  have  the  same  number  of 
tracks  and  each  track  in  one  hypothesis  has  a  unique  companion  track 
which  is  similar  to  it  in  the  other  hypothesis.  The  similarity  of 
tracks  is  measured  by  the  state  estimate  distributions,  which  accounts 
for  the  name  of  distribution-oriented  techniques .  The  rationale  behind 
this  approach  is  that  each  track  state  distribution  should  reflect  all 


the  relevant  information  which  affects  any  future  event  due  to  the 
underlying  Markovian  assumptions.  Thus,  if  two  state  distributions  of 
tracks  are  close  enough,  we  would  expect  the  future  behavior  of  the  two 
tracks  to  be  similar. 


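Suppose two hypotheses $\lambda_1$ and $\lambda_2$ are similar, where
$\lambda_i = \{\tau_1^i, \ldots, \tau_n^i\}$, $i = 1, 2$. Then hypothesis combining
leads to a new hypothesis $\bar{\lambda} = \{\bar{\tau}_1, \ldots, \bar{\tau}_n\}$ with

$$p(\bar{\lambda} \mid Z) = p(\lambda_1 \mid Z) + p(\lambda_2 \mid Z) \qquad (3.24)$$

and for $j = 1, \ldots, n$

$$p(x \mid \bar{\tau}_j, Z) = \frac{p(\lambda_1 \mid Z)\, p(x \mid Z, \tau_j^1) + p(\lambda_2 \mid Z)\, p(x \mid Z, \tau_{\Pi(j)}^2)}{p(\lambda_1 \mid Z) + p(\lambda_2 \mid Z)} \qquad (3.25)$$

where $\Pi(\cdot)$ is a permutation which maps a track into a similar track.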


However,  there  still  remains  the  crucial  question  of  choosing  a 
good  measure  of  "similarity,"  and  a  good  threshold  for  that  measure. 

When  each  track  distribution  is  Gaussian,  Reid  [5]  proposes  inequality 
tests  using  the  means  and  the  diagonal  elements  of  the  covariance 
matrices.  However,  no  theoretical  justification  for  the  use  of  those 
particular  inequalities  is  given.  His  intuitive  reason  is  that  for 
tracks  to  be  similar,  both  their  means  and  their  variances  should  not  be 
widely  different.  This  test  may  work  well  for  Gaussian  distributions, 
but  cannot  be  applied  to  more  general  distributions. 


Distribution-oriented  combining  is  used  to  its  extreme  in  [8]  and 
[9]  where  all  the  hypotheses  are  combined  after  proper  windowing 
(described  below).  This  is  only  possible  when  a  fixed  number  of  targets 
are  assumed,  i.e.,  every  hypothesis  has  the  same  number  of  tracks.  When 
two  Gaussian  distributions  are  combined,  the  combined  distribution 
becomes  a  Gaussian  sum  distribution  because  two  different  hypotheses 
represent  two  disjoint  events.  When  a  Gaussian  sum  distribution  is 
approximated  by  a  Gaussian  distribution,  the  means  and  the  variances  are 
usually  equated.  Unlike  the  results  in  [5]  or  [8],  the  Gaussian  sum 
form is preserved to a certain extent in [10] where each track
distribution remains a Gaussian sum rather than Gaussian. In this case, the
hypothesis  trees  are  extended  to  include  one  lower  level,  namely  the 
distribution  level.  The  hypothesis  management  (pruning  and  combining) 
techniques  must  then  be  extended  to  include  this  level. 
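
As a small numerical illustration of (3.24) and (3.25), the Python sketch below combines two hypotheses whose companion tracks have scalar Gaussian state densities, and then collapses the resulting two-component Gaussian sum to a single Gaussian by equating the mean and variance. The scalar state, the function names and the numbers are assumptions made for this example.

```python
def combine_hypotheses(p1, p2):
    """Equation (3.24): the combined hypothesis carries the summed probability."""
    return p1 + p2

def combine_gaussian_tracks(p1, mean1, var1, p2, mean2, var2):
    """Equation (3.25) for scalar Gaussian track densities, followed by
    moment matching.

    The exact result of (3.25) is a Gaussian sum with weights w1 and w2;
    the returned (mean, variance) pair describes the single Gaussian with
    the same first two moments.
    """
    w1 = p1 / (p1 + p2)
    w2 = p2 / (p1 + p2)
    mean = w1 * mean1 + w2 * mean2
    second_moment = w1 * (var1 + mean1 ** 2) + w2 * (var2 + mean2 ** 2)
    return mean, second_moment - mean ** 2

p_combined = combine_hypotheses(0.3, 0.1)
mean, var = combine_gaussian_tracks(0.3, 0.0, 1.0, 0.1, 4.0, 1.0)
# mean is 1.0 and var is 4.0 (up to rounding): the spread between the two
# component means inflates the variance beyond the common value 1.0.
```

The inflation of the variance shows why the similarity test matters: when the two means are far apart, the matched Gaussian is a poor summary of the Gaussian sum.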

In  summary,  unless  each  track  distribution  is  assumed  to  be  and 
forced  to  be  Gaussian,  the  similarity  criteria  proposed  in  [5],  [8], 
etc.,  may  sometimes  be  unjustified.  Theoretical  results  on  similarity 
criteria  are  still  lacking  in  our  opinion. 

On the other hand, the measurement-index-oriented combining
techniques consider each track as a subset of the past cumulative
measurement index set. This technique has been proposed by Singer et al.
[11] and is a classical technique in the multitarget tracking
literature. In these schemes, the tracks whose measurement indices on the
past N scans are the same are regarded as "similar" and identified.
Thus they are often referred to as N-scan or depth-N methods. In Figure
3-4, hypotheses 1 and 2 can be combined if N = 3.

The justification is that since each track distribution is driven
by the measurements assigned to it, if two tracks share the same
measurements in the recent scans (data sets) they should be similar. This
scheme is criticized by Reid [5] on the ground that some events in the
past may have a greater influence than the most recent N scans.
However, since the Markovian nature of a target model removes this
possibility, the N-scan approach is attractive because of its simplicity.
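
The N-scan identification rule can be sketched in a few lines of Python. The representation of a track as a set of `(j, k)` pairs, the function names and the two example tracks are assumptions for illustration and are not taken from Figure 3-4.

```python
from collections import defaultdict

def n_scan_key(track, recent_scans):
    """Measurement indices of a track restricted to the given recent data sets."""
    return frozenset((j, k) for (j, k) in track if k in recent_scans)

def group_similar_tracks(tracks, scan_order, n):
    """Group tracks whose measurement indices agree on the last n scans (n >= 1)."""
    recent = set(scan_order[-n:])
    groups = defaultdict(list)
    for track in tracks:
        groups[n_scan_key(track, recent)].append(track)
    return list(groups.values())

# Two invented tracks over data sets 1..4 that differ only in data set 1.
track_a = frozenset({(1, 1), (1, 2), (2, 3), (1, 4)})
track_b = frozenset({(2, 1), (1, 2), (2, 3), (1, 4)})
scans = [1, 2, 3, 4]

groups_depth_3 = group_similar_tracks([track_a, track_b], scans, 3)  # one group
groups_depth_4 = group_similar_tracks([track_a, track_b], scans, 4)  # two groups
```

With depth 3 the two tracks are identified; with depth 4 they stay distinct. How the state distributions of identified tracks should be merged is not settled by this rule.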

After identifying tracks according to the N-scan or depth-N
criterion, we may have several identical hypotheses, i.e., hypotheses with
the identical set of tracks. Then those hypotheses are combined in a
natural way. In a sense, this approach may be actually viewed as
combining tracks rather than combining hypotheses. In fact, since
similarity is initially defined at the track level even in
distribution-oriented methods, one may further classify the combining
techniques according to where combining takes place. For example,
distribution-oriented combining may be performed at the track level or at a
hypothesis level. While track-level combining may seem to be more
straightforward, it creates the problem of how two distributions should
be combined, since there is no natural weighting formula (similar to
that of (3.25)) used in distribution-oriented combining at the
hypothesis level.


To summarize, N-scan or depth-N methods have two major
disadvantages, namely, the unresolved issues of (1) how to choose a right depth
N,  and  (2)  how  to  combine  track  state  distributions.  Just  as  in  the 
case  of  pruning,  many  theoretical  questions  remain  in  combining 
hypotheses.  Our  current  preference  is  distribution-oriented  combining 
at  the  hypothesis  level  since  there  is  a  clear  way  for  combining  two 
probability  distributions.  The  similarity  condition,  however,  should  be 
carefully  chosen  according  to  the  physical  nature  of  the  particular 
problems  and  the  chosen  representation  of  each  track  distribution,  etc. 

3.5.3  Windowing 

When  i.i.d.  target  models  are  used,  each  measurement  in  a  data  set 
can  be  individually  evaluated  by  likelihood  functions.  When  a  track 
state  distribution  has  a  reasonable  variance  and  the  measurement  errors 
are not exceptionally large, one can expect the track-measurement
likelihood to be very small for a measurement which is geometrically far from
the  expected  position  based  on  the  state  distribution  associated  with 
the  track.  Windowing  techniques  are  generally  designed  to  set  an 
appropriate  threshold  so  that  the  track-measurement  likelihood  in  such  a 
case  becomes  zero  rather  than  a  small  positive  number. 

Thus, one may consider such techniques to be a special kind of
pruning, i.e., immediate pruning of branches based solely on one
likelihood function. In other words, windowing is a method for preventing
all but a certain set of data from being even tentatively associated with
each track. For this reason, such a process is often called data
validation. When track state distributions and measurement error
distributions are both Gaussian, windowing can be accomplished by a
classical chi-square test. As described in [12], this test may be
performed in several steps. For example, the first step may consist of a
test (square test), followed by a normalized square-of-innovation test
(ellipse test), and finally, the likelihood function test itself.

Another  view  of  windowing  is  that  it  is  part  of  the  distribution 
representation  and  modeling  process.  According  to  this  view,  when  the 
track  state  and  measurement  distributions  are  modeled  as  Gaussian,  they 
really  are  approximations  of  reality  since  such  distributions  can  only 
have  compact  supports  in  the  real  world.  For  example,  when  the  standard 
deviation  of  the  measurement  error  is  one  mile,  a  data  point  100  miles 
away  from  the  mean  of  the  track  distribution  should  yield  zero  as  its 
likelihood  rather  than  a  very  small  but  positive  number.  We  prefer  this 
point  of  view  to  the  pruning  or  approximation  view.  Thus  any  windowing 
process  should  be  carefully  designed  to  reflect  the  particular  physical 
nature  of  the  problem. 
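
For the Gaussian case, the ellipse test mentioned above can be sketched as follows. This is a generic chi-square gate, not the specific multi-step test of [12]; the two-dimensional measurement, the function names and the gate value 9.21 (approximately the 99 percent point of a chi-square distribution with two degrees of freedom) are assumptions chosen for the example.

```python
def normalized_innovation_squared(innovation, cov):
    """d^2 = v^T S^-1 v for a 2-D innovation v and a 2x2 covariance S."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    vx, vy = innovation
    # Closed-form inverse of the 2x2 matrix applied to the innovation.
    sx = (d * vx - b * vy) / det
    sy = (-c * vx + a * vy) / det
    return vx * sx + vy * sy

def in_window(measurement, predicted, cov, gate=9.21):
    """Ellipse test: accept if the normalized squared innovation is within the gate."""
    innovation = (measurement[0] - predicted[0], measurement[1] - predicted[1])
    return normalized_innovation_squared(innovation, cov) <= gate

unit_cov = ((1.0, 0.0), (0.0, 1.0))                      # one-mile standard deviation
far_point = in_window((100.0, 0.0), (0.0, 0.0), unit_cov)  # False
near_point = in_window((1.0, 1.0), (0.0, 0.0), unit_cov)   # True
```

The first call reproduces the example in the text: a data point 100 miles from the predicted position, with a one-mile standard deviation, falls outside the window and is never associated with the track.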

3.5.4  Clustering 

The  basic  idea  behind  clustering  is  that  two  events  taking  place  at 
locations  far  apart  should  be  independent  and  can  be  evaluated 
separately.  Clustering  techniques  have  been  described  in  algorithmic 
form  in  [5]  for  general  cases  and  more  rigorously  in  [8]  for  a  special 


case.  When  adequate  windowing  is  performed,  there  is  a  natural  way  to 
avoid redundant calculations in evaluating hypotheses since the posterior
probability of each hypothesis is the product of an appropriate set
of  likelihood  functions.  This  constitutes  another  view  of  clustering. 

Mathematically, clustering can be defined as follows. Let $H$ be the
set of all non-zero-probability hypotheses at a given data set, i.e.,
for all $\lambda \in H$,

$$p(\lambda \mid Z) > 0. \qquad (3.26)$$

For each possible track $\tau$, the posterior probability of $\tau$ is given by

$$p(\tau \mid Z) = \sum_{\tau \in \lambda \in H} p(\lambda \mid Z). \qquad (3.27)$$

Let $\mathcal{T}$ be the set of all non-zero-probability tracks, i.e., for all
$\tau \in \mathcal{T}$,

$$p(\tau \mid Z) > 0. \qquad (3.28)$$

Let $\mathcal{C}$ be any partition of $\mathcal{T}$ which satisfies the following
condition:

For any pair $(C', C'')$ of elements in $\mathcal{C}$ such that $C' \neq C''$, and
any $\tau' \in C'$ and $\tau'' \in C''$:

$$\tau' \cap \tau'' = \emptyset. \qquad (3.29)$$

This condition means that the non-zero-probability tracks are
partitioned so that no two tracks which are in different elements
(clusters) of the partition share a common measurement index. For each
$C$ in $\mathcal{C}$, let:

$$H_C = \{\lambda \cap C \mid \lambda \in H\} \qquad (3.30)$$

and

$$I_C = \bigcup C. \qquad (3.31)$$

Then clustering is the process of generating $\{(I_C, H_C) \mid C \in \mathcal{C}\}$.
Each $H_C$ is the set of local hypotheses which consist only of tracks in
$C$. For each cluster $C \in \mathcal{C}$ and each local hypothesis $\lambda_C$ in
$H_C$, define the local posterior probability $p_C$ by

$$p_C(\lambda_C \mid Z) = \sum \{p(\lambda \mid Z) \mid \lambda \in H,\ \lambda \cap C = \lambda_C\}. \qquad (3.32)$$

Then it is clear that

$$p(\lambda \mid Z) = \prod_{C \in \mathcal{C}} p_C(\lambda_C \mid Z) \qquad (3.33)$$

where $\lambda_C = \lambda \cap C$.

Each global non-zero-probability hypothesis can thus be represented
as a union of local hypotheses (one from each cluster) and its posterior
probability is the product of the local probabilities of the local
hypotheses. From this definition, we see that clustering involves
partitions at all levels: hypotheses, tracks and measurements.
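
One way to obtain the finest partition satisfying the non-intersection condition (3.29) is to link any two tracks that share a measurement index and take connected components. The Python sketch below does this with a union-find structure and also computes the local probabilities of (3.32). It is an illustration under stated assumptions, not the procedure of [5]: tracks are `frozenset`s of `(j, k)` pairs, hypotheses are `(frozenset_of_tracks, probability)` pairs, and the function names and numbers are hypothetical.

```python
def cluster_tracks(tracks):
    """Finest partition of tracks in which different clusters share no
    measurement index (condition (3.29))."""
    parent = list(range(len(tracks)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    owner = {}  # measurement index -> first track seen that uses it
    for t, track in enumerate(tracks):
        for index in track:
            if index in owner:
                parent[find(t)] = find(owner[index])
            else:
                owner[index] = t

    clusters = {}
    for t, track in enumerate(tracks):
        clusters.setdefault(find(t), []).append(track)
    return list(clusters.values())

def local_hypotheses(hypotheses, cluster):
    """H_C of (3.30) together with the local probabilities p_C of (3.32)."""
    members = set(cluster)
    local = {}
    for tracks_in_h, p in hypotheses:
        key = frozenset(tracks_in_h & members)
        local[key] = local.get(key, 0.0) + p
    return local

t1 = frozenset({(0, 1)})
t2 = frozenset({(0, 1), (0, 2)})   # shares measurement (0, 1) with t1
t3 = frozenset({(1, 2)})           # shares nothing with t1 or t2
clusters = cluster_tracks([t1, t2, t3])

# Global hypotheses built as products of invented local probabilities.
local_a = {frozenset(): 0.2, frozenset({t1}): 0.3, frozenset({t2}): 0.5}
local_b = {frozenset(): 0.4, frozenset({t3}): 0.6}
global_h = [(a | b, pa * pb) for a, pa in local_a.items()
            for b, pb in local_b.items()]
recovered_a = local_hypotheses(global_h, [t1, t2])
```

Here `clusters` contains two clusters, `[t1, t2]` and `[t3]`, and `recovered_a` reproduces the local probabilities 0.2, 0.3 and 0.5, so that each global probability factors as in (3.33). Six global hypotheses are represented by three plus two local ones.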

Equation (3.33) is called the orthogonality condition. According
to the above definition of clustering, the orthogonality condition
should hold whenever the non-intersection condition of the tracks given
by (3.29) holds. However, when some approximation techniques such as
pruning and combining are employed, the orthogonality condition may not
hold. The clustering technique described in [5] is a method in which
the orthogonality condition is maintained without checking the
non-intersection condition of the tracks. This technique can be described
in terms of algorithmic procedures as follows:

1.  Initialization  of  a  Cluster  -  Whenever  there  is  a  measurement  such 
that  the  newly  detected  target  likelihood  is  not  zero  but  the 
track-measurement  likelihood  with  every  existing  track  is  zero,  a 
new  cluster  should  be  created  out  of  the  measurement. 

2.  Cluster  Merging  -  Whenever  there  is  a  measurement  such  that  each  of 
the  corresponding  track-measurement  likelihood  functions  with  two 
or  more  tracks  in  different  clusters  is  non-zero  (in  other  words, 
there  is  a  measurement  lying  in  the  intersection  of  the  validation 
regions  of  two  tracks  in  two  different  clusters),  such  clusters 
should  be  merged  before  the  measurement  can  be  processed.  The 
merging  of  the  clusters  is  accomplished  by  forming  the  union  of  the 
tracks  in  the  clusters,  generating  the  global  hypotheses  and 
evaluating  the  global  probabilities  as  the  products  of  the  local 
probabilities. 

3.  Cluster Splitting - Whenever a track with probability one, i.e.,
one contained in every local hypothesis in a cluster, is found, it
is split off to form a new cluster consisting of one hypothesis with
that sole track. Of course, the local probability of such a
hypothesis is one.
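
The merging and splitting steps above can be sketched in a few lines. This is only a minimal sketch under assumed data structures: a cluster is taken to be a list of (hypothesis, probability) pairs, a hypothesis is a frozenset of track labels, and the function names and track labels are hypothetical, not taken from [5].

```python
from itertools import product


def merge_clusters(cluster_a, cluster_b):
    """Merge two clusters whose track sets are disjoint.

    Every global hypothesis is the union of one local hypothesis from
    each cluster, and its probability is the product of the two local
    probabilities.
    """
    merged = []
    for (hyp_a, p_a), (hyp_b, p_b) in product(cluster_a, cluster_b):
        merged.append((hyp_a | hyp_b, p_a * p_b))
    return merged


def split_certain_tracks(cluster):
    """Split off every track that appears in all hypotheses of a cluster.

    Returns the reduced cluster and a list of new single-hypothesis
    clusters, each holding one such track with probability one.
    """
    certain = frozenset.intersection(*(hyp for hyp, _ in cluster))
    remainder = [(hyp - certain, p) for hyp, p in cluster]
    new_clusters = [[(frozenset({t}), 1.0)] for t in sorted(certain)]
    return remainder, new_clusters


# Two hypothetical clusters with disjoint track sets.
cluster_1 = [(frozenset({"t1"}), 0.7), (frozenset(), 0.3)]
cluster_2 = [(frozenset({"t2"}), 0.6), (frozenset({"t3"}), 0.4)]
merged = merge_clusters(cluster_1, cluster_2)
total = sum(p for _, p in merged)
```

The product form is what makes clustering attractive: two clusters with m and n hypotheses are stored as m + n entries, and the m x n global hypotheses are only generated when a merge is forced.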

We believe that any serious attempt on practical problems should
not  be  made  without  clustering,  particularly  in  a  problem  involving  a 
large  geographic  area.  One  may  even  assert  that  clustering  is  probably 
the  most  powerful  hypothesis  management  technique  in  controlling  the 
number  of  hypotheses.  Of  course,  how  successful  a  clustering  technique 
is  depends  on  the  external  conditions  such  as  target  density,  measure¬ 
ment  errors  and  target  dynamics. 

The  clustering  procedure  described  above  does  not  necessarily 
guarantee  the  finest  clustering.  The  finest  clustering  may  be  found 
according  to  our  mathematical  definition  of  clustering.  The  test  of  the 
non-intersection  condition  can  be  easily  implemented  if  we  identify  two 
tracks  with  the  same  measurement  indices  in  a  certain  number  of  the  most 
recent  scans  just  like  in  a  measurement-index-oriented  combining  scheme. 
The  orthogonality  condition  can  be  met  by  modifying  the  local  probabili¬ 
ties  using  appropriate  approximation  techniques  whenever  possible.  This 
constitutes  a  new  cluster-splitting  technique  which  may  improve  on  that 
described above. Although this newly proposed version of clustering
seems promising, we do not have any actual implementation experience
with it.


4.  INFORMATION  FUSION  IN  DISTRIBUTED  GENERALIZED  TRACKER/CLASSIFIER 


Each node in the DSN receives information from other nodes. This
information has to be fused with the local information to obtain an
improved assessment of the state of the world. In this section we shall
discuss the fusion carried out at each node. While many ways of fusion
are possible, our work is based on the following philosophy:

Sufficient  statistics  for  multitarget  tracking  are  communicated 
through  the  network.  Upon  receiving  these  sufficient  statistics, 
each  node  attempts  to  reconstruct  the  global  sufficient  statistics 
that  would  be  available  if  the  actual  sensor  data  sets  were  commun¬ 
icated. 

We  shall  present  the  distributed  version  of  the  Generalized 
Tracker/Classifier  of  Section  3.  Our  discussion  in  this  section  will  be 
restricted  to  broadcast  communication,  where  each  node  broadcasts  its 
results  to  all  the  other  nodes  periodically.  The  derivations  of  these 
results,  as  well  as  their  generalizations  to  more  complex  situations, 
are  given  in  Appendix  C. 


4.1  PROBLEM  STATEMENT 

In this section we state the distributed multitarget tracking and
classification problem.


4.1.1  Models 


The target and sensor models are the same as those discussed in
Section 3. Our emphasis is still on independent and identically
distributed target models. The system now consists of a set of processing
nodes called $N$. Each node $n$ in the set processes the measurements from
the set of sensors called $S_n$. We assume that the sensor sets for
different nodes are disjoint, i.e., $S_n \cap S_{n'} = \emptyset$ for
$n \neq n'$. On the other hand, the sensors of different nodes may have
overlapping coverage.

In  a  general  DSN,  the  nodes  may  communicate  in  many  different  ways. 
In  this  section,  we  restrict  our  attention  to  the  broadcast  type  commun¬ 
ication  (more  general  communication  for  distributed  estimation  systems 
will be discussed in Section 5). The processing nodes communicate with
each other at various times (the times need not be periodic and the
communications need not be synchronized). When messages are broadcast
and received, each node in the network updates its assessment of
the state of the world.

We assume that between broadcast times, each node receives a large
amount of data from the sensors. Thus it is more efficient for the nodes
to process the local data first before communication. In particular,
each node broadcasts a set of possible hypotheses and the probability of
each hypothesis, a set of possible tracks and the state distribution of
each track, and the expected number of undetected targets based on the
local information. These quantities from various nodes are then to be
integrated or fused to obtain a better estimate. In order to define the
problem properly, we need to generalize the definitions of tracks and
hypotheses introduced in Section 3.

4.1.2  Tracks  and  Hypotheses 

Let $\mathbf{K}$ be the data set index set as defined in Section 3, i.e.,
the set of all data set indices $k = (t,s)$. For each $k$ in $\mathbf{K}$,
the data set and the measurement index set are

$$ z(k) = \left( (y_j(k))_{j=1}^{N_M(k)},\ N_M(k),\ k \right) \tag{4.1} $$

and

$$ J_M(k) \times \{k\} = \{1, \ldots, N_M(k)\} \times \{k\}. \tag{4.2} $$

The set of all data sets is then

$$ \mathbf{Z} = \bigcup_{k \in \mathbf{K}} z(k) \tag{4.3} $$

and the set of all measurement indices is

$$ \mathbf{J} = \bigcup_{k \in \mathbf{K}} J_M(k) \times \{k\}. \tag{4.4} $$

Since each processing node generates hypotheses and tracks based on
different information, we now generalize the definitions of tracks and
hypotheses in Section 3 so that they can be defined on arbitrary subsets
of $\mathbf{J}$. Let $J$ be any subset of $\mathbf{J}$. Then a track on $J$
is a (possibly empty) subset of $J$, and a possible track is one which
contains at most one measurement index in each single data set. A
(data-to-data association) hypothesis on $J$ is a (possibly empty)
collection of tracks on $J$, and a possible hypothesis is one containing
only nonempty possible tracks such that no two distinct tracks intersect.
Let $T(J)$ and $H(J)$ be the sets of possible tracks and possible
hypotheses respectively.
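
The two definitions translate directly into membership tests. The following is a minimal sketch, assuming a measurement index is a tuple (j, t, s), a track is a frozenset of such tuples, and a hypothesis is a set of tracks; the function names are illustrative only.

```python
def is_possible_track(track):
    """A possible track holds at most one measurement index per data set (t, s)."""
    data_sets = [(t, s) for (_, t, s) in track]
    return len(data_sets) == len(set(data_sets))


def is_possible_hypothesis(hypothesis):
    """A possible hypothesis holds only nonempty possible tracks, pairwise disjoint."""
    tracks = list(hypothesis)
    if any(len(tr) == 0 or not is_possible_track(tr) for tr in tracks):
        return False
    # Disjointness: no measurement index may be used by two tracks.
    used = [m for tr in tracks for m in tr]
    return len(used) == len(set(used))
```

Note that the empty hypothesis (no tracks at all, i.e., every measurement is a false alarm) is a possible hypothesis, whereas a hypothesis containing an empty track is not.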


Since there is a unique correspondence between the elements in the
sets $\mathbf{J}$, $\mathbf{Z}$ and $\mathbf{K}$, we may call a track (or
hypothesis) on $J \subset \mathbf{J}$ a track (or hypothesis) on $K$ or $Z$
if $K \subset \mathbf{K}$ is such that

$$ J = \bigcup_{k \in K} \{1, \ldots, N_M(k)\} \times \{k\} \tag{4.5} $$

and

$$ Z = \bigcup_{k \in K} z(k). \tag{4.6} $$


For each subset $Z$ of $\mathbf{Z}$, we can define an information state
$I(Z)$ consisting of the set of all possible tracks, the set of all
possible hypotheses, the posterior probability of each hypothesis, the
state distribution/density of each track, and the expected number of
undetected targets, i.e.,

$$ I(Z) = \{ T(Z),\ H(Z),\ (P(\lambda|Z),\ \lambda \in H(Z)),\ (p(x(t)|Z,\tau),\ \tau \in T(Z)),\ \nu(Z) \} \tag{4.7} $$

where $p(x(t)|Z,\tau)$ and $\nu(Z)$ are as defined in Section 3. $I(Z)$ is
called the information state because it summarizes all the information
about the targets contained in $Z$ and is used in the recursion in the GTC.

4.1.3  Information  Fusion  Problem 

The nodes are assumed to communicate in a broadcast mode. If the
actual data sets were communicated, after one such communication time,
all the agents would have the same information $\bar{Z}$, which is a
subset of $\mathbf{Z}$. Let $I(\bar{Z})$ be the information state of
$\bar{Z}$. Until the next communication time, each node $n$ receives data
sets from the sensors in $S_n$ so that its information increases. The
increasing information can be processed using the GTC. Let $Z_n$ be the
information of node $n$ just before the next communication, and $I(Z_n)$
be the information state. If again the actual data sets were
communicated, the information of each node would change to

$$ Z = \bigcup_{n \in N} Z_n \tag{4.8} $$

with information state $I(Z)$. Our assumption, however, is that the nodes
only communicate their information states. Thus the problem is how to
recover the information state $I(Z)$ from $(I(Z_n))_{n \in N}$ and
$I(\bar{Z})$. Figure 4-1 illustrates the structure of the problem.

There  are  two  parts  to  this  problem: 

•  Hypothesis Formation: Given $I(\bar{Z})$ and $(I(Z_n))_{n \in N}$, how
should all the possible tracks and hypotheses on $Z$ be formed?

•  Hypothesis Evaluation: Given $I(\bar{Z})$, $(I(Z_n))_{n \in N}$, and all
the possible tracks and hypotheses on $Z$, how should the probabilities
of the hypotheses and the state distributions of the tracks be computed
to complete the description of $I(Z)$?

The  information  fusion  module  consists  of  three  submodules  which 
carry  out  these  two  functions  of  hypothesis  formation  and  hypothesis 
evaluation  and  the  additional  function  of  hypothesis  management.  These 
will  be  discussed  separately. 

4.2  HYPOTHESIS  FORMATION 

The  objective  of  this  submodule  is  to  generate  all  the  possible 
hypotheses  and  tracks  from  the  local  hypotheses  and  tracks. 

4.2.1  Example 

The  following  example  (Figure  4-2)  shows  that  one  has  to  be  careful 
in  forming  the  global  hypotheses  from  the  local  hypotheses. 

Consider two nodes each with one sensor. Node 1 has sensor 1, and
node 2 has sensor 2. Sensor 1 (s=1) generates a data set with only one
measurement at time 1 (t=1). This measurement can be indexed by (j,t,s)
where j=1 indexes the only measurement within the measurement set. Node
1 forms two hypotheses and broadcasts to node 2. Thus

$$ \bar{J} = \{(1,1,1)\}. \tag{4.9} $$

At time t=2, each sensor generates another data set with one measurement
each. These measurements can be indexed by (1,2,1) and (1,2,2)
respectively. The new cumulative measurement index sets at the two nodes
are then, for n = 1, 2,

$$ J_n = \{(1,2,n)\} \cup \bar{J} = \{(1,2,n), (1,1,1)\}. \tag{4.10} $$

Local hypotheses are formed for each node. The global cumulative
measurement index set is

$$ J = J_1 \cup J_2 = \{(1,1,1), (1,2,1), (1,2,2)\}. \tag{4.11} $$

Figure 4-2 shows the five local hypotheses defined on $J_n$, n = 1,2, and
the fifteen (15) global hypotheses defined on $J$.

Figure 4-2:  Local and Global Hypotheses - an Example

Note  that  if  we  compose  the  local  hypotheses  without  any  restric¬ 
tion,  there  would  be  twenty-five  (25)  hypotheses.  This  is  in  excess  of 
the  actual  number  of  global  hypotheses  if  all  the  data  were  processed  in 
a  centralized  manner.  Thus  some  of  the  compositions  are  inadmissible. 
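
The counts quoted in this example can be checked by brute force. The sketch below is only an illustration under assumed conventions: a measurement index is a tuple (j, t, s), a track is a frozenset of indices with at most one index per data set (t, s), and a hypothesis is a frozenset of pairwise disjoint tracks. The function names are hypothetical.

```python
from itertools import combinations


def possible_tracks(indices):
    """All nonempty subsets of `indices` with at most one index per data set."""
    indices = sorted(indices)
    tracks = []
    for r in range(1, len(indices) + 1):
        for combo in combinations(indices, r):
            data_sets = [(t, s) for (_, t, s) in combo]
            if len(set(data_sets)) == len(data_sets):
                tracks.append(frozenset(combo))
    return tracks


def possible_hypotheses(indices):
    """All collections of pairwise disjoint possible tracks on `indices`."""
    tracks = possible_tracks(indices)
    result = []

    def extend(start, chosen, used):
        # Every partial collection is itself a hypothesis: measurements
        # not covered by any track are treated as false alarms.
        result.append(frozenset(chosen))
        for i in range(start, len(tracks)):
            if not (tracks[i] & used):
                extend(i + 1, chosen + [tracks[i]], used | tracks[i])

    extend(0, [], frozenset())
    return result


J1 = {(1, 1, 1), (1, 2, 1)}  # cumulative index set of node 1
J2 = {(1, 1, 1), (1, 2, 2)}  # cumulative index set of node 2
local_1 = possible_hypotheses(J1)
local_2 = possible_hypotheses(J2)
global_hyps = possible_hypotheses(J1 | J2)
print(len(local_1), len(local_2), len(global_hyps), len(local_1) * len(local_2))
# prints: 5 5 15 25
```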

On close examination, we discover that some compositions are
inconsistent. For example, one hypothesis $\lambda^{(1)}$ from node 1
implies that (1,1,1) is a false alarm, while a hypothesis $\lambda^{(2)}$
from node 2 implies that (1,1,1) is a target. These two obviously
conflict and cannot be composed to form a global hypothesis. We also
note that these two local hypotheses are expanded from different
$\bar{\lambda}$'s, i.e., they have different predecessors. This suggests
that we should only compose pairs of hypotheses with common predecessors.

If we do that, the number of possible compositions is 13, which is
less than the number of global hypotheses. This means that some
compositions of local hypotheses should produce more than one global
hypothesis. Table 4-1 shows how this happens. On the horizontal axis,
we have local hypotheses for node 2; on the vertical axis, we have local
hypotheses for node 1. Each entry in the matrix contains the global
hypotheses composed from the pair of local hypotheses. An empty entry
indicates an impossible composition. We note that two of the
compositions yield more than one global hypothesis each. For example,
the local hypotheses $\lambda^{(1)}$ and $\lambda^{(2)}$ in which (1,1,1)
is a false alarm and the new measurement is a target have one track each.
These two tracks can correspond to the same target or two different
targets, thus resulting in two global hypotheses.

Table 4-1  Hypothesis Composition

4.2.2  Hypothesis Formation Procedure

We have the following two level procedure for hypothesis formation:

1.  Hypothesis-to-hypothesis composition:

    a.  For each $\lambda_n$, identify the predecessor hypothesis
        $\bar{\lambda} = \lambda_n | \bar{J}$, the restriction of
        $\lambda_n$ to the past common cumulative measurement index
        set $\bar{J}$.

    b.  Exhaust all possible compositions of local hypotheses with the
        same predecessor.

2.  Track-to-track composition: For each collection of local
    hypotheses which can be composed,

    a.  Construct a unique extension $\tau$ for every $\bar{\tau}$ in
        $\bar{\lambda}$ by letting $\tau = \bigcup_{n \in N} \tau_n$,
        where $\tau_n$ is the unique extension in $\lambda_n$ of
        $\bar{\tau}$. Let $\lambda_{OLD}$ be the set of such tracks.

    b.  Exhaust all possible compositions of tracks which are not
        extensions of tracks in $\bar{\lambda}$. Let $\lambda_{NEW}$ be
        one such set. Then any hypothesis composed from a given local
        hypothesis composition is of the form
        $\lambda_{OLD} \cup \lambda_{NEW}$.


For the example in Figure 4-2, the operations of Level 1 generate
altogether 13 possible hypothesis-to-hypothesis compositions with 4 from
the predecessor $\bar{\lambda}_0$ and 9 from the predecessor
$\bar{\lambda}_1$. Except for two compositions
$(\lambda^{(1)}, \lambda^{(2)})$, $\lambda_{NEW}$ is unique, and thus
there is only a single global hypothesis for each composition. For the
first of these two pairs, $\lambda_{OLD} = \emptyset$ and there are two
$\lambda_{NEW}$'s, resulting in two global hypotheses. For the second
pair, $\lambda_{OLD} = \{(1,1,1)\}$ and there are again two
$\lambda_{NEW}$'s, resulting in two global hypotheses.
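
The two level procedure can be exercised on this two-node example. The following is a minimal sketch, not the implementation used in our system: it handles exactly two nodes, represents a track as a frozenset of (j, t, s) indices and a hypothesis as a frozenset of tracks, and relies on the assumption that the sensor sets of the nodes are disjoint, so that any union of tracks from different nodes that agree on the common past is again a possible track. All function names are hypothetical.

```python
from itertools import combinations, permutations


def restrict(hypothesis, J_bar):
    """Predecessor of a local hypothesis: its restriction to J_bar."""
    return frozenset(tr & J_bar for tr in hypothesis if tr & J_bar)


def compose_pair(h1, h2, J_bar):
    """Global hypotheses composed from one local hypothesis per node."""
    pred = restrict(h1, J_bar)
    if pred != restrict(h2, J_bar):
        return []  # different predecessors: inadmissible composition
    # lambda_OLD: merge the unique extensions of each predecessor track.
    old = set()
    for tb in pred:
        e1 = next(tr for tr in h1 if tb <= tr)
        e2 = next(tr for tr in h2 if tb <= tr)
        old.add(e1 | e2)
    # Tracks that are not extensions of predecessor tracks.
    new1 = [tr for tr in h1 if not (tr & J_bar)]
    new2 = [tr for tr in h2 if not (tr & J_bar)]
    results = []
    # lambda_NEW: every partial matching of new tracks across the nodes.
    for r in range(min(len(new1), len(new2)) + 1):
        for left in combinations(new1, r):
            for right in permutations(new2, r):
                merged = [a | b for a, b in zip(left, right)]
                rest = [t for t in new1 if t not in left]
                rest += [t for t in new2 if t not in right]
                results.append(frozenset(old) | frozenset(merged) | frozenset(rest))
    return results


a, b, c = (1, 1, 1), (1, 2, 1), (1, 2, 2)
J_bar = frozenset({a})


def local_hypotheses(new):
    """The five local hypotheses on {a, new}."""
    return [
        frozenset(),
        frozenset({frozenset({new})}),
        frozenset({frozenset({a})}),
        frozenset({frozenset({a}), frozenset({new})}),
        frozenset({frozenset({a, new})}),
    ]


node1, node2 = local_hypotheses(b), local_hypotheses(c)
admissible = [(h1, h2) for h1 in node1 for h2 in node2
              if restrict(h1, J_bar) == restrict(h2, J_bar)]
global_hyps = [g for h1, h2 in admissible for g in compose_pair(h1, h2, J_bar)]
print(len(admissible), len(global_hyps), len(set(global_hyps)))
# prints: 13 15 15
```

The 13 admissible compositions produce the 15 global hypotheses without duplication, which is the property that lets each node rebuild the centralized hypothesis set from communicated local hypotheses alone.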

4.3  HYPOTHESIS  EVALUATION 

Given the global hypotheses and global tracks constructed from the
local hypotheses and tracks, the objective of the hypothesis evaluation
submodule is to compute their probabilities and state distributions
using the communicated local information. Specifically, we need to
evaluate $P(\lambda|Z)$ for all global hypotheses, $p(x(t)|Z,\tau)$ for
all global tracks and $\nu(K)$, the expected number of undetected targets.

4.3.1  Static  Target  Models 

We  first  state  the  hypothesis  evaluation  algorithm  for  static  tar¬ 
get  models.  Since  deterministic  random  process  models  can  be  reduced  to 
static  models,  this  algorithm  is  also  useful  when  the  targets  can  be 
approximated  by  deterministic  random  processes,  e.g.,  when  the  driving 
noise  in  a  linear  stochastic  system  is  very  small.  The  following 
results  are  derived  in  Appendix  C. 


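Fusion Algorithm

For stationary targets and broadcast communication, we have for every
$\lambda \in H(J)$,

$$ P(\lambda|Z) = C^{-1}\, P(\bar{\lambda}|\bar{Z})^{-(\#N-1)} \Big( \prod_{n \in N} P(\lambda_n|Z_n) \Big) \prod_{\tau \in \lambda} \ell_\tau \tag{4.12} $$

where $C$ is a normalization constant, $\#N$ is the number of elements in
$N$, and

$$ \ell_\tau = \int \frac{\prod_{n \in N} \tilde{p}(x|Z_n,\tau_n)}{(\tilde{p}(x|\bar{Z},\bar{\tau}))^{\#N-1}}\, \mu(dx). \tag{4.13} $$

The expected number of targets undetected up to $K$ is:

$$ \nu(K) = \int \frac{\prod_{n \in N} \tilde{p}(x|Z_n,\emptyset)}{(\tilde{p}(x|\bar{Z},\emptyset))^{\#N-1}}\, \mu(dx). \tag{4.14} $$

$\tilde{p}(x|Z_n,\tau_n)$ and $\tilde{p}(x|\bar{Z},\bar{\tau})$ are given by

$$ \tilde{p}(x|Z_n,\tau_n) = \begin{cases} p(x|Z_n,\tau_n) & \text{if } \tau_n \neq \emptyset \\ \nu(K_n)\, p(x|Z_n,\emptyset) & \text{if } \tau_n = \emptyset \end{cases} \tag{4.15} $$

$$ \tilde{p}(x|\bar{Z},\bar{\tau}) = \begin{cases} p(x|\bar{Z},\bar{\tau}) & \text{if } \bar{\tau} \neq \emptyset \\ \nu(\bar{K})\, p(x|\bar{Z},\emptyset) & \text{if } \bar{\tau} = \emptyset \end{cases} \tag{4.16} $$

$p(x|Z_n,\tau_n)$ and $p(x|\bar{Z},\bar{\tau})$ are the state distributions
at the time of fusion conditioned by track $\tau_n$ and $Z_n$, and by
track $\bar{\tau}$ and $\bar{Z}$, respectively. Furthermore, the state
distributions can be fused to obtain

$$ p(x|Z,\tau) = C^{-1}\, \frac{\prod_{n \in N} p(x|Z_n,\tau_n)}{(p(x|\bar{Z},\bar{\tau}))^{\#N-1}} \tag{4.17} $$

where $C$ is a normalization constant.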


Similar to hypothesis formation, this fusion formula (4.12) has
again a two level structure. At the high level, we consider the
probabilities of the local hypotheses. The probability of the global
hypothesis is a product of the probabilities of the local hypotheses.
However, because each of the local probabilities has been computed using
the prior probability of the predecessor hypothesis, the product has to
be divided by the (#N-1)-th power of the probability of the common
predecessor hypothesis to prevent any double counting of this
probability. This elimination is quite standard in distributed estimation
problems. When the information from multiple dependent sources is to be
combined or fused, the redundant information has to be removed. A more
complete discussion of this will be found in the next section and in
Appendix B.

Given each hypothesis-to-hypothesis composition, the lower level
considers the likelihood of each track. Specifically, $\ell_\tau$
evaluates the likelihood that the multiple local tracks correspond to a
single global track. $\ell_\tau$ is related to an integral of the product
of the state distributions of the local tracks. Thus if the local tracks
have very similar statistics, e.g., position means and variances, the
product will be a non-zero function and the integral will be large,
resulting in a high likelihood for the track. On the other hand, if the
state distributions of the local tracks are very different, they will
have little overlap and the integral will be small or zero. In this case,
the likelihood that the two local tracks correspond to the same target
will be small. The division by the state distributions of the common
prior tracks is again to prevent double counting of any common
information.
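
The behavior of the track likelihood can be seen on a finite state grid, where the integral becomes a sum. The following is a rough sketch under simplifying assumptions: the state takes finitely many values, the common prior density is strictly positive on the grid, and the numerical values are invented for illustration. The function name is hypothetical.

```python
def track_likelihood(local_densities, prior_density):
    """Discrete analogue of the track likelihood integral.

    local_densities: one list per node, the state density of that node's
        local track over a finite state grid.
    prior_density: the state density of the common prior track on the
        same grid (must be strictly positive).
    """
    n_nodes = len(local_densities)
    total = 0.0
    for i, prior in enumerate(prior_density):
        prod = 1.0
        for dens in local_densities:
            prod *= dens[i]
        # Divide by the (#N - 1)-th power of the common prior to avoid
        # counting the shared information more than once.
        total += prod / prior ** (n_nodes - 1)
    return total


prior = [0.25, 0.25, 0.25, 0.25]
# Two local tracks that agree on the most likely state.
similar = track_likelihood([[0.7, 0.1, 0.1, 0.1], [0.6, 0.2, 0.1, 0.1]], prior)
# Two local tracks that point at different states.
different = track_likelihood([[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]], prior)
```

With these numbers the agreeing pair scores 1.84 and the disagreeing pair 0.64, so the composition that merges the agreeing tracks into one global track receives the larger weight.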

The updates of the expected number of undetected targets and of the
state distributions of the global tracks have similar equations. In the
case of Gaussian distributions, the computation only involves means and
covariances.
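
For Gaussian densities, fusing the local state distributions and dividing out the common prior reduces to arithmetic on precisions (inverse variances). The sketch below assumes a scalar static state, a hypothetical measurement model z = x + noise, and invented numerical values; it compares the fused result with the centralized one that would be obtained if both measurements were processed at a single node.

```python
def fuse_gaussians(local, prior):
    """Fuse scalar Gaussian posteriors that share a common Gaussian prior.

    local: list of (mean, variance), one per node.
    prior: (mean, variance) of the common prior.
    The product of the local densities divided by the (#N - 1)-th power
    of the prior is again Gaussian; precisions add and subtract.
    """
    n = len(local)
    prior_mean, prior_var = prior
    precision = sum(1.0 / v for _, v in local) - (n - 1) / prior_var
    if precision <= 0:
        raise ValueError("fused precision must be positive")
    weighted = sum(m / v for m, v in local) - (n - 1) * prior_mean / prior_var
    return weighted / precision, 1.0 / precision


def bayes_update(mean, var, z, r):
    """Posterior of a static scalar state after z = x + noise, noise variance r."""
    gain = var / (var + r)
    return mean + gain * (z - mean), (1.0 - gain) * var


prior = (0.0, 4.0)
local_1 = bayes_update(*prior, z=1.2, r=1.0)   # node 1 processes its measurement
local_2 = bayes_update(*prior, z=0.8, r=2.0)   # node 2 processes its measurement
fused = fuse_gaussians([local_1, local_2], prior)
central = bayes_update(*local_1, z=0.8, r=2.0)  # both measurements at one node
```

Up to rounding, `fused` and `central` coincide, which is the sense in which the communicated statistics are sufficient.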


4.3.2  Dynamic  Target  Models 

If the targets are dynamic, as in more realistic situations, the
hypothesis evaluation formula has the same form as in the static case.
However, the track likelihood functions must now be computed
differently. All the conditional probabilities of $x$ must now be replaced
by those of $x_{T_j}$, where $x_{T_j}$ is the target state evaluated at
the set

$$ T_j = \{ t \mid (t,s) \in K - \bar{K} \}, \tag{4.18} $$

the times when sensor observations are made since the last communication
time. In other words, to decide whether two local tracks could have
come from the same target, we consider the probability distributions of
the state for these two tracks over the entire time interval.


For a special class of random processes, called deterministic
processes, where the state at any time uniquely determines the
process, the track likelihoods depend only on the state distributions
of the tracks at $\min(T_j)$. Figure 4-3 illustrates how the likelihoods
are  computed  for  the  three  different  processes.  For  both  static  and 
deterministic  random  processes  the  densities  of  the  target  states  at  a 
single  time  are  needed  in  track  likelihood  computation.  For  a  general 
random  process,  the  densities  of  the  states  over  one  time  interval  are 
needed.  This,  of  course,  makes  hypothesis  evaluation  more  difficult. 

In  many  situations,  however,  such  processes  can  be  approximated  by 
deterministic  random  processes.  This  is  the  case  when  the  noise  driving 
the  linear  system  which  models  the  target  motion  is  very  small. 


4.4  HYPOTHESIS  MANAGEMENT 

Hypothesis  management  is  again  needed  to  keep  the  number  of 
hypotheses  manageable  within  the  computational  resources  of  each  node. 
The  same  hypothesis  management  techniques  discussed  in  Section  3.3  are 
again  applicable. 


5.  INFORMATION DISTRIBUTION AND PROBLEMS IN DISTRIBUTED ESTIMATION


The  information  distribution  module  in  each  node  distributes  the 
local  information  to  the  other  nodes  in  the  network.  It  thus  determines 
the  following: 

-  when  to  send  a  message 

-  where  to  send  the  message 

-  what  to  include  in  the  message 

The  three  items  above  together  constitute  the  information  structure 
of  the  system.  The  design  of  information  structures  is  one  of  the  most 
difficult  problems  related  to  the  DSN  since  it  is  not  amenable  to  ana¬ 
lytic  studies.  However,  an  efficient  design  can  serve  the  objective  of 
getting  the  needed  information  to  the  right  node  at  the  right  time 
without  using  too  much  communication  resources.  In  this  project,  we 
have  studied  these  issues  through  simulation  experiments. 

In  Section  4,  we  have  discussed  the  information  fusion  problem  for 
distributed  multitarget  tracking  assuming  a  broadcast  type  communica¬ 
tion.  A  key  feature  of  the  information  fusion  algorithm  is  that  any 
redundant  information  contained  in  dependent  data  sets  has  to  be 
removed.  If  the  communication  pattern  is  different  from  the  broadcast 
type,  information  fusion  becomes  more  complicated.  Since  not  much  has 
been  done  in  this  area,  our  discussion  will  be  restricted  to  general 
estimation problems rather than multitarget tracking. A more detailed
description can be found in Appendix B and [4].

5.1  WHAT  TO  DISTRIBUTE 

We  adopt  the  philosophy  that  each  node,  upon  receiving  messages 
from  other  nodes,  attempts  to  reconstruct  the  best  assessment  of  the 
state  of  the  world  as  if  the  actual  sensor  reports  were  communicated. 
Thus  the  messages  should  represent  the  information  state  at  each  node. 

In  the  context  of  multitarget  tracking  and  classification,  this  informa¬ 
tion  state  corresponds  to  the  set  of  hypotheses,  the  set  of  tracks,  the 
probabilities  of  the  hypotheses,  the  state  distributions  of  the  tracks 
and  the  expected  number  of  undetected  targets.  When  all  of  these  are 
communicated,  as  discussed  in  the  previous  section,  each  node  is  then 
able  to  reconstruct  the  global  hypotheses  and  their  probabilities. 

If  communication  constraints  are  present,  the  information  to  be 
distributed  to  the  other  nodes  may  have  to  be  reduced.  In  this  case, 
only  a  subset  of  the  hypotheses  may  be  distributed.  The  information 
fusion  algorithm  described  in  the  previous  section  will  no  longer  be 
optimal.  The  question  is  then:  how  many  hypotheses  ought  to  be  kept. 
This  problem  is  similar  to  the  hypothesis  management  problem  discussed 
in  Section  3.  Again,  no  general  theory  is  available.  Rather,  communi¬ 
cation  is  dictated  by  practical  considerations  such  as  the  bandwidth  of 
the network. Frequently, only the best hypothesis from each node can be
communicated.


A more adaptive strategy is to vary the communication based on the
information present in the set of hypotheses. For example, one adaptive
strategy is to transmit a sufficient number of hypotheses until the
cumulative probability exceeds a certain prescribed value. This is
similar to the adaptive pruning strategy described in Section 3.5.1. If
one hypothesis stands out as being highly probable, then only that
hypothesis will be distributed. On the other hand, if several
hypotheses are equally probable, then they should all be distributed.
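
The cumulative-probability rule is simple to state in code. This is a minimal sketch, assuming hypotheses are held as (label, probability) pairs; the function name and the threshold values are illustrative and not taken from our system.

```python
def select_hypotheses(hypotheses, threshold):
    """Pick the most probable hypotheses until their total exceeds `threshold`.

    hypotheses: list of (label, probability) pairs.
    Returns the selected pairs in order of decreasing probability.
    """
    ranked = sorted(hypotheses, key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for label, prob in ranked:
        selected.append((label, prob))
        cumulative += prob
        if cumulative > threshold:
            break
    return selected


# One dominant hypothesis: only that hypothesis is transmitted.
dominant = select_hypotheses([("h1", 0.95), ("h2", 0.03), ("h3", 0.02)], 0.9)
# Several comparable hypotheses: more of them are transmitted.
ambiguous = select_hypotheses([("h1", 0.3), ("h2", 0.3), ("h3", 0.3), ("h4", 0.1)], 0.8)
```

The message length thus adapts to the ambiguity of the local picture, at the cost of a variable communication load.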

In some situations, it may be desirable to go one step further. The
information distributed should not only be decided by the transmitting
node, but depend on requests from potential receivers. Thus a node
which is highly confused may request a lot of information from other
nodes to help to disambiguate the situation. Although we have not
implemented these adaptive strategies in our current system, they should
be included in an improved version.


5.2  ISSUES  RELATED  TO  DISTRIBUTED  ESTIMATION  NETWORKS 

A DSN is a special case of a general distributed estimation
network. A main advantage of such systems is that there is no single
central node whose failure or destruction may disable the entire system.
For such a distributed system to be really effective, the communication
network supporting the nodes should have a certain amount of redundancy.
Otherwise, some nodes may be isolated from others in a failure. A
redundant network, however, means that the messages arriving at a node
by different paths may contain the same information. If this redundant
information is not removed in the processing, the same information may
be used multiple times, resulting in an incorrect assessment of the
state of the world.

Appendix  B  contains  a  theory  for  handling  the  distributed  estima¬ 
tion  problem  for  arbitrary  network  structures.  A  distributed  fusion 
formula  for  combining  the  local  conditional  probabilities  of  the  state 
of  a  system  to  obtain  the  optimal  global  conditional  probability  is 
given.  This  theory,  together  with  the  theory  of  multitarget  tracking 
described  in  Section  3,  is  then  used  to  develop  a  theory  for  distributed 
multitarget  tracking  and  classification.  In  the  following,  we  outline 
this  theory  and  illustrate  the  importance  of  proper  processing  with  an 
example . 


5.2.1  Basic  Results 

Let $x$ be the state or random variable to be estimated. Let $S$ be a
set of sensors. At a given time $t$ in $T$, a sensor $s$ generates an
output or measurement $z(t,s)$. Let $\mathbf{Z}$ be the set of all such
measurements. Assume that the measurements are all conditionally
independent given the state $x$.

Let $N$ be the set of processing nodes or estimation agents, each
receiving the reports from a subset of $S$. The information available to
an agent at any time is a subset of $\mathbf{Z}$. We assume that each
agent $n$ at time $t$ computes the conditional probability of $x$ given
the available information denoted by $Z(t,n)$. Agents also communicate
the conditional probabilities to other agents. When a conditional
probability is received by an agent, it fuses this with its local
conditional probability to obtain the conditional probability which
would have resulted if the actual measurements, instead of the
conditional probabilities, had been communicated through the network.


To  carry  out  this  fusion  properly,  the  redundant  information  con¬ 
tained  in  the  conditional  probabilities  has  to  be  removed.  The  follow¬ 
ing  lemma  contains  the  basic  results. 


Lemma: Let $Z_1$, $Z_2$ be two subsets of $\mathbf{Z}$. Then

$$ p(x|Z_1 \cup Z_2) = C\, \frac{p(x|Z_1)\, p(x|Z_2)}{p(x|Z_1 \cap Z_2)} \tag{5.1} $$


where  C  is  a  normalization  constant. 


This lemma can be regarded as a distributed version of Bayes' rule and
is crucial in distributed estimation problems. Its proof can be found
in Appendix B. The set $Z_1 \cup Z_2$ is the joint information in $Z_1$
and $Z_2$ while the set $Z_1 \cap Z_2$ is the common information. In
combining the conditional probabilities $p(x|Z_1)$ and $p(x|Z_2)$ to
obtain $p(x|Z_1 \cup Z_2)$, we note that both of these local
probabilities have used the common information. Thus, if we combine
these probabilities with a naive rule, e.g., by a product rule, the
common information represented by $p(x|Z_1 \cap Z_2)$ would have been
used twice. The redundancy can be removed by dividing by
$p(x|Z_1 \cap Z_2)$. In other words, the lemma states that the
probabilities $(p(x|Z_1), p(x|Z_2))$ do not contain enough information to
recover $p(x|Z_1 \cup Z_2)$ and that an additional probability
$p(x|Z_1 \cap Z_2)$ is also needed.

This  lemma  can  be  used  as  a  basis  for  developing  optimal  fusion 
formulas  for  arbitrary  information  structures.  In  a  general  case,  each 
node  would  have  to  know  the  history  of  the  messages  and  remember  some 
statistics  from  the  past. 
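
The lemma can be checked numerically on a small discrete problem. The sketch below assumes a three-valued state with a uniform prior and a hypothetical sensor model (each measurement reports the true state with probability 3/5 and each wrong value with probability 1/5); these numbers are invented for illustration and are not those of any table in this report. Exact rational arithmetic is used so that the comparison is exact.

```python
from fractions import Fraction

STATES = ("a", "b", "c")


def likelihood(z, x):
    # Hypothetical sensor model: correct report with probability 3/5.
    return Fraction(3, 5) if z == x else Fraction(1, 5)


def posterior(measurements):
    """p(x | measurements), uniform prior, measurements independent given x."""
    weights = {}
    for x in STATES:
        w = Fraction(1, 3)
        for z in measurements:
            w *= likelihood(z, x)
        weights[x] = w
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}


def fuse(p1, p2, p_common):
    """Normalized p1 * p2 / p_common, as in the lemma."""
    raw = {x: p1[x] * p2[x] / p_common[x] for x in STATES}
    total = sum(raw.values())
    return {x: v / total for x, v in raw.items()}


common = ["a"]            # measurement known to both agents
z1 = common + ["b"]       # agent 1 adds its own measurement
z2 = common + ["a"]       # agent 2 adds its own measurement
fused = fuse(posterior(z1), posterior(z2), posterior(common))
central = posterior(common + ["b", "a"])
# Naive product rule: ignores the common information.
naive = fuse(posterior(z1), posterior(z2), {x: Fraction(1) for x in STATES})
```

Here `fused` equals `central` exactly, while the naive product overstates the probability of x = a (27/31 instead of 9/13) because the shared measurement is counted twice.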

5.2.2  Example 

The following simplified example illustrates the removal of
redundant information in an arbitrary network. Consider a network with
three nodes {1, 2, 3}. Node i gets the measurements from sensor i. The
state x is a discrete random variable with three possible values
{a, b, c}. The a priori probability of x is uniform, i.e.,

$$ p(x=a) = p(x=b) = p(x=c) = \tfrac{1}{3}. \tag{5.2} $$

The sensor measurement z(t,s) also takes values in {a, b, c}, and the
conditional probabilities of the measurements for all the sensors are
given in Table 5-1.


Table  5-1  Sensor  Characteristics 


p(z(t,l)|x)  =  p(z(t,3)|x) 


p(z(t,2) |x) 


At  each  time  t  =  0,  1,...,  each  node  processes  the  sensor  data  and 
computes  the  conditional  probabilities  of  x  given  all  the  available 
information. At a time s = t + d, where d is a small time interval,
each  node  sends  the  conditional  probability  to  its  immediate  neighbor 
according  to  the  graph  of  Figure  5-1.  The  communication  is  cyclic  and 
counter-clockwise.  Note  that  the  information  sent  from  one  node  eventu¬ 
ally  returns  to  the  same  node.  Upon  receiving  the  conditional  probabil¬ 
ities,  each  node  combines  them  to  improve  on  the  local  estimate.  Three 
algorithms  for  information  fusion  are  considered. 


•  Optimal  Algorithm:  Let  Z(t,i)  be  the  data  available  to  a  node  i  at 
time  t  if  the  actual  measurements  were  communicated  through  the  net¬ 
work  instead  of  the  conditional  probabilities.  Then  the  optimal 
fusion  algorithm  is 

where C is a normalization constant and [i] is i modulo 3. In this
algorithm, in addition to the local conditional probability
$p(x|Z(t,i))$ and $p(x|Z(t,[i+1]))$ from the neighbor, each node has to
remember some of the earlier probabilities in order to remove the
redundant information. In addition, the history of the messages is
also needed.

Figure 5-1:  Communication Structures

• Heuristic Algorithm 1: In this algorithm, we assume that the
conditional probabilities from the nodes do not contain any redundant
information. Thus the fusion algorithm is given as a product

    p(x|Z(s,i)) = C p(x|Z(t,i)) p(x|Z(t,[i+1]))                   (5.4)

where C is again a normalization constant.

• Heuristic Algorithm 2: Heuristic algorithm 2 is similar to 1 in
that it assumes no redundant information. However, it explicitly
takes into account the suboptimal nature of the algorithm and tries
to remain more objective. This is accomplished by having

    p(x|Z(s,i)) = C [p(x|Z(t,i)) p(x|Z(t,[i+1]))]^{0.5}           (5.5)

Since the square root operation has the effect of flattening out a
probability distribution, this algorithm is less likely to have
complete confidence in any particular conclusion and is more willing
to incorporate new information. Thus it may be viewed as a hedging
strategy.
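
The two heuristic fusion rules can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names and the example distribution are hypothetical, and the optimal algorithm is not shown. The example fuses a distribution with an identical copy of itself, which is the extreme case of fully redundant information.

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to one (the constant C)."""
    total = sum(weights.values())
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    return {state: w / total for state, w in weights.items()}


def fuse_heuristic_1(local, neighbor):
    """Equation (5.4): normalized product of the two distributions."""
    return normalize({x: local[x] * neighbor[x] for x in local})


def fuse_heuristic_2(local, neighbor):
    """Equation (5.5): normalized square root of the product."""
    return normalize({x: (local[x] * neighbor[x]) ** 0.5 for x in local})


# Hypothetical distribution over the three states {a, b, c}.
p = {"a": 0.6, "b": 0.3, "c": 0.1}

# Fusing p with an identical (fully redundant) copy of itself:
sharpened = fuse_heuristic_1(p, p)  # proportional to p squared, more peaked
hedged = fuse_heuristic_2(p, p)     # proportional to p, i.e. unchanged
```

Under the product rule the redundant copy is counted twice and the distribution becomes more peaked, whereas the square-root rule leaves it unchanged in this special case.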


The results of the simulation studies are shown in Table 5-2. The
optimal algorithm and heuristic algorithm 2 both converge to the true
value of x, although the convergence of heuristic algorithm 2 is slower.
Heuristic algorithm 1 converges very fast but frequently to the wrong
value of x. In terms of memory requirements, the two heuristic
algorithms are quite similar while the optimal algorithm requires more
memory.


Table 5-2 Simulation Results

             | Percentage of   |              |
Algorithm    | Correct         | Memory       | Convergence
             | Classifications | Requirements | Speed
-------------+-----------------+--------------+------------
Optimal      | 100             | High         | Fast
Heuristic 1  | 60              | Low          | Very Fast
Heuristic 2  | 100             | Low          | Medium

Figure 5-2 shows a typical simulation run where heuristic algorithm
1 converges to the wrong value b while the true state is a. The
conditional probabilities at each node are displayed for the states
x = a and x = b. We can explain the behavior of heuristic algorithm 1
as follows. At time t = 1, sensor 2 gets an erroneous measurement b.
Since sensor 2 is assumed to be quite accurate, the local conditional
probability is biased towards x = b. Node 1 incorporates this into its
local probability at time 2, resulting in a high value for the
conditional probability of x = b. This bias is propagated to node 3 at
time 3. In the meantime, node 2 has obtained some correct measurements.
However, the communication from node 3 arrives and increases its
confidence in x = b. From then on, the process gets out of control and
the conditional probabilities all converge rapidly to x = b.

If  we  observe  the  behavior  of  heuristic  algorithm  2,  we  notice  that 
it  never  concludes  that  the  state  is  any  value  with  probability  1.  With 
this  kind  of  hedging,  the  algorithm  is  more  likely  to  recover  after  an 
error  has  been  made. 


6. NUMERICAL EXAMPLES


6.1  INTRODUCTION 

In  this  section,  two  numerical  examples  are  described  to  illustrate 
the  performance  of  the  distributed  Generalized  Tracker/Classifier  (GTC) 
and  to  compare  its  performance  with  those  of  alternative  structures. 

For both examples, simple target dynamics were chosen; in fact, the
first example assumes stationary targets. Simple dynamics were chosen
for the following reasons. First, target dynamics mostly affect the
filtering problem, which, although it always lies at the bottom of the
GTC or distributed GTC algorithm, is not the main focus of our research.
Second, if complex dynamics are to be used, many numerical
approximations, sometimes very bold ones, must be employed, making it
difficult to single out the major factors affecting the performance.
Third, although general target dynamics can be treated at least in
principle, as discussed in earlier sections, meaningful performance
analysis for complicated dynamics requires further improvements in the
efficiency of the GTC and distributed GTC computer codes. For the same
reasons, very simple sensor models were chosen.

The  underlying  basic  assumptions  are: 

1. Targets are distributed along a straight line (e.g., on a road or
in an air corridor).

2.  Targets  are  either  stationary  (the  first  example)  or  moving  at 
almost  constant  velocities  (the  second  example). 

3.  Sensors  measure  position  and  velocity  along  the  road.  These  models 
roughly  approximate  range-range  rate  radars  whose  angles  are  very 
narrow,  and  which  point  to  the  line  segment  at  very  shallow  angles, 
so  that  the  range  readings  can  be  regarded  as  a  linear  function  of 
the  1-dimensional  position  corrupted  by  some  noise. 

4.  In  the  first  example,  it  is  assumed  that  either  the  target  speeds 
are  very  small  compared  with  sensor  revisit  times  or  sensors  are 
tuned  to  detect  stationary  targets. 

5.  In  the  second  example,  it  is  also  assumed  that  the  sensor  readings 
include  noise-corrupted  linear  measurements  of  the  target  positions 
and  velocities. 

Furthermore,  we  also  assume  the  independent  and  identically  distributed 
target  models  for  which  the  current  version  of  GTC  and  distributed-GTC 
have  been  designed. 


6.2  STATIONARY  TARGET  EXAMPLE 


We  first  consider  the  case  of  stationary  targets.  This  can  be  used 
to  approximate  targets  whose  movements  are  small  within  the  observation 
interval. 


6.2.1  Example  Scenario 

In this example, each individual target is represented by a
one-dimensional position and is assumed to be stationary. Therefore, the
state of each target is a real number. The field-of-view of each sensor
is assumed to be identical and to be the line segment [0,L]. Each
sensor creates a measurement of the form

    y = x + noise                                                 (6.1)

when it detects a target at x. The noise is an independent Gaussian
random variable with standard deviation \sigma which is small compared
to L. The probability of detecting a target at x is given by


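    p_D(x) = P_{Dmax} \int_0^L \frac{1}{\sqrt{2\pi}\,\sigma}
             \exp\left( -\frac{(y-x)^2}{2\sigma^2} \right) dy      (6.2)

if x \in [-3\sigma, L+3\sigma] and 0 otherwise. The probability density
function p_M(.|x) of a measurement y given that it originates from a
detected target x is given by:

    p_M(y|x) = \frac{ \frac{1}{\sqrt{2\pi}\,\sigma}
                      \exp\left( -\frac{(y-x)^2}{2\sigma^2} \right) }
                    { \int_0^L \frac{1}{\sqrt{2\pi}\,\sigma}
                      \exp\left( -\frac{(y'-x)^2}{2\sigma^2} \right) dy' }   (6.3)

if y \in [0, L] and zero otherwise. It is assumed that all the sensors
are modeled identically.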


The target positions are independent and identically distributed
and the common distribution is uniform on [-3\sigma, L+3\sigma]. The
total number of targets is constant but random and unknown; its a priori
distribution is Poisson with mean \nu_T. The target density is then

    \beta_T = \nu_T / (L + 6\sigma)                               (6.4)

The number of false alarms in each scan is also Poisson with mean
\nu_{FA} for each sensor. The adaptive-breadth/adaptive-thresholding
pruning technique described in Section 3 is employed for the GTC and the
distributed GTC. The threshold level is represented by
\epsilon \in (0, 1).
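
The scenario above can be sketched as a small data generator. This is a minimal sketch under stated assumptions, not the simulation code used in the study: the function names are hypothetical, a target is taken to be detected with probability P_Dmax provided its noisy reading falls inside [0, L], and false alarms are assumed uniform over the field-of-view (the text gives only their Poisson count).

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's method (adequate for small means)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def draw_targets(rng, length, sigma, nu_t):
    """Poisson number of targets, uniform on [-3*sigma, L + 3*sigma]."""
    n = poisson(rng, nu_t)
    return [rng.uniform(-3 * sigma, length + 3 * sigma) for _ in range(n)]


def simulate_scan(rng, targets, length, sigma, pd_max, nu_fa):
    """Return the unlabeled measurements of one scan of one sensor."""
    scan = []
    for x in targets:
        y = x + rng.gauss(0.0, sigma)  # Equation (6.1)
        # Assumption: detection needs the reading to land inside [0, L].
        if rng.random() < pd_max and 0.0 <= y <= length:
            scan.append(y)
    # Assumption: false alarms are uniform over the field-of-view.
    scan += [rng.uniform(0.0, length) for _ in range(poisson(rng, nu_fa))]
    rng.shuffle(scan)  # the origin of each measurement is unknown
    return scan


# Baseline-like values: sigma/L = .001, P_Dmax = .7, 5 false alarms/scan,
# target density 5/L so that nu_T = 5 * (L + 6*sigma) / L.
rng = random.Random(0)
L, sigma = 1.0, 0.001
targets = draw_targets(rng, L, sigma, 5.0 * (L + 6 * sigma) / L)
scans = [simulate_scan(rng, targets, L, sigma, 0.7, 5.0) for _ in range(15)]
```

Each element of `scans` is one unlabeled data set of the kind a single-sensor node would process; the centralized scheme would receive two such sequences.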


6.2.2  Communication  Schemes  and  Performance  Measures 

Three  communication  schemes  are  examined: 

• [Scheme 1] decentralized - single sensor (15 scans),

• [Scheme 2] centralized - two sensors (15 + 15 scans),

• [Scheme 3] distributed - two single-sensor nodes with
communication/information fusion every 5 scans.

In the distributed scheme, all the hypotheses at each node are
communicated. The centralized scheme can be regarded as the case when the
sensor data are communicated every scan. Since we assume identical
sensors with identical fields-of-view and with the same performance, the
difference between Scheme 1 and Scheme 2 is only in the number of
sensors or equivalently the number of scans in a centralized GTC system.


For  each  scheme  the  baseline  parameters  are  given  in  Table  6-1. 


Pruning Threshold         | \epsilon  | .05
Target Density            | \beta_T   | 5/L
Measurement Error         | \sigma/L  | .001
Expected Number of FA     | \nu_{FA}  | 5/scan
Probability of Detection  | P_{Dmax}  | .7


To  determine  the  sensitivity  of  the  schemes  to  parameter  variations  each 
parameter  in  the  above  table  was  varied  and  used  in  Monte  Carlo  simula¬ 
tions  which  examined  the  performance  of  each  scheme. 


NAME | SYMBOL | MEASURE


rectangular, threshold Munkres algorithm [13]. The result yields a set
of correctly associated pairs. A track \tau in A is said to be correctly
associated with a target i if (i, \tau) is a correctly associated pair.
Otherwise it is declared a false track. A target i is said to be
missed if there is no \tau in A such that (i, \tau) is a correctly
associated pair. Perfect association is by definition the case when
there is no false track and no missed target. The first three
performance indices, N_{FT}, N_{MT} and P_{PA}, are thus calculated. The
position error is then averaged over all the correctly associated pairs
to yield \sigma_{pos}. The last two indices are rough measures of the
computational requirements. T_E is measured by the average CPU time
used to complete one entire Monte Carlo run. N_{TR} is the average
number of tracks processed at each moment.
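
The scoring of one run can be sketched as follows. Because the start of the description is missing here, several details are assumptions rather than the authors' procedure: tracks and targets are compared by absolute position difference, a pair is admissible only within a hypothetical `gate`, and an exhaustive search (practical only for a handful of targets) stands in for the thresholded Munkres algorithm.

```python
def associate(targets, tracks, gate):
    """Best one-to-one pairing of targets and tracks within `gate`.

    Prefers more pairs first and a smaller total distance second.
    """
    best = (0, 0.0, [])

    def search(i, used, pairs, cost):
        nonlocal best
        if i == len(targets):
            if (len(pairs), -cost) > (best[0], -best[1]):
                best = (len(pairs), cost, list(pairs))
            return
        search(i + 1, used, pairs, cost)  # leave target i unpaired
        for j, track in enumerate(tracks):
            d = abs(targets[i] - track)
            if j not in used and d <= gate:
                pairs.append((i, j))
                search(i + 1, used | {j}, pairs, cost + d)
                pairs.pop()

    search(0, frozenset(), [], 0.0)
    return best[2]


def score(targets, tracks, gate):
    """Return (false tracks, missed targets, perfect association flag)."""
    pairs = associate(targets, tracks, gate)
    false_tracks = len(tracks) - len(pairs)
    missed_targets = len(targets) - len(pairs)
    return false_tracks, missed_targets, false_tracks == 0 and missed_targets == 0


# Hypothetical run: three targets, three tracks, gate of 3 sigma (sigma = .001).
result = score([0.1, 0.5, 0.9], [0.101, 0.9005, 0.3], 0.003)
```

Averaging the first two outputs over all Monte Carlo runs would give N_FT and N_MT, and the fraction of runs with a true flag would give P_PA.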


6.2.3  Experimental  Results 

Because  of  the  stationary  targets  and  the  simple  sensor  model,  the 
filtering  required  is  very  simple  and,  in  most  cases,  reduces  to  one  of 
the  simplest  forms  of  Kalman  filtering.  Whenever  necessary,  such  as 
when  a  track  state  distribution  is  centered  at  the  edge  of  the  sensor's 
field-of-view and should be updated as a missed measurement, an
appropriate quantization approximation is used to perform filtering or
to calculate the track-measurement likelihood functions. A quantization
approximation is also used to calculate the distribution of undetected
targets.

Table 6-3 Baseline Quantitative Comparison

              | Prob. of    | Average | Average | Position   | Average   | Average
              | Perfect     | No. of  | No. of  | Estimation | Execution | No. of
Scheme        | Association | False   | Missed  | Error (*)  | Time (**) | Tracks
              |             | Tracks  | Targets |            |           |
--------------+-------------+---------+---------+------------+-----------+--------
Decentralized | .62         | 0       | .54     | 1.19       | 2.69      | 7.32
Centralized   |             | .02     | .32     | .238       | 4.29      | 5.27
Distributed   |             | .04     | .24     | .403       | 5.95      | 6.01

(*)  Normalized by the standard deviation (= .001 x field of view)

(**) Average execution time for one run:
     Decentralized: 15 Data Sets
     Centralized:   30 Data Sets
     Distributed:   30 Data Sets + 3 Fusion Operations


Table 6-3 shows the results of a 50-run Monte Carlo simulation.
The table compares the performance of the three schemes using the
baseline parameters shown in Table 6-1. For each scheme we have assumed
that every statistical parameter is known exactly and is used to
calculate the track-measurement likelihood functions. As far as the
first three criteria (which represent the target detection capability of
the GTC or the distributed GTC) are concerned, the performance of the
centralized (Scheme 2) and the distributed (Scheme 3) schemes is almost
identical. Since the expected number of true targets is 5, both N_{FT}
and N_{MT} are very small. Thus both schemes perform very well and
screen out the false alarms, five of which appear on the average in
every scan. This similarity in performance is not surprising since the
information fusion algorithm used in the distributed scheme is designed
so that the results of the centralized scheme may be reconstructed from
the partial information of the nodes. The performance of the distributed
scheme is, however, slightly better than that of the centralized case,
indicating the advantage of distributed calculation when hypothesis
pruning is employed (pruning threshold = .02). On the other hand, the
performance of the decentralized scheme (Scheme 1) shows how poorly the
system performs when the amount of data is half that of the other
schemes. Table 6-3 indicates an apparent performance degradation due to
the relatively small amount of data. However, this degradation is
highly dependent on the quality of data (false alarm rate), the pruning
threshold, etc., and will be discussed later.


Because of the relatively low probability of detection (P_{Dmax} = .7)
and the hypothesis combining used in both the centralized GTC and the
distributed GTC, the position error \sigma_{pos} is much larger than
that from filtering with known origins of all the measurements. In
particular, in the decentralized scheme with its small amount of data,
\sigma_{pos} exceeds even the sensor measurement error level. The
difference between the centralized and distributed schemes seems to
result from the pruning and combining in the information fusion program
used in the distributed scheme. The last two criteria, T_E and N_{TR},
roughly reflect the computational time and space requirements of each
scheme. The number of hypotheses and tracks increases rapidly as the
amount of data increases. This rapid growth is, however, controlled by
the pruning and combining of hypotheses. As seen in Table 6-3, the
resultant increase in the CPU time from the decentralized to the
centralized scheme is about 60%, compared with a 100% increase in data.
The distributed scheme uses about 40% more CPU time than the
centralized scheme. When we consider the fact that the distributed
scheme maintains two copies of data and processes them separately and
that a relatively complicated fusion program is used, this increase
seems very moderate.
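
As a check on the two percentages quoted above, the ratios can be recomputed from the execution-time column of Table 6-3, assuming its entries read 2.69 (decentralized), 4.29 (centralized) and 5.95 (distributed). The superscripts below are labels introduced here for illustration only.

```latex
\frac{T_E^{\mathrm{cent}}}{T_E^{\mathrm{dec}}} = \frac{4.29}{2.69} \approx 1.59,
\qquad
\frac{T_E^{\mathrm{dist}}}{T_E^{\mathrm{cent}}} = \frac{5.95}{4.29} \approx 1.39
```

These ratios correspond to increases of roughly 60% and 40%, in agreement with the figures stated in the text.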

When hypotheses are evaluated successively as every new data set
arrives (as in the current implementation), the uncertainty in the
origin of each measurement is resolved rapidly after enough data is
accumulated. With less data available, the confusion may not be resolved
by the end of one run, and thus many hypotheses (and accordingly many
tracks) must be stored. The difference in N_{TR} between the
decentralized and the centralized schemes reflects this effect
correctly. The distributed scheme needs to store slightly more tracks
than the centralized scheme. However, when we consider the fact that
each node has its own copy of the tracking data (significant
redundancy), this difference is surprisingly small. The qualitative
comparison of the three schemes is summarized in Table 6-4. It should be
noted, however, that the space requirement comparison does not include
the size of the program itself.


(*)  Excluding  the  program  space. 

The  same  program  is  used  both  for  the 
decentralized  and  the  centralized  schemes. 
The  distributed  scheme's  program  size  is  at 
least  twice  as  large. 


Figures 6-1 to 6-5 show the comparative statistics and are obtained
by varying the base parameters shown in Table 6-1. Each point has been
created by a 50-run Monte Carlo simulation. For each run, we have
assumed that all the a priori statistics are exactly known and the
track-measurement likelihood functions are calculated accordingly.


6.2.3.1 Effect of Pruning Thresholds

The first set of curves shows the effect of varying the pruning
threshold \epsilon. In the distributed scheme, the same threshold is
used for both the GTC of each node and the information fusion program.
The first three criteria shown in Figure 6-1 indicate that, with a low
enough pruning threshold, an almost perfect performance is obtained in
each of the three schemes. As the last two sets of curves indicate,
however, the cost of obtaining this performance varies from scheme to
scheme. Over almost the entire range of \epsilon, T_E and N_{TR} show
the same trend as in the baseline case. Namely, the decentralized
scheme, which utilizes only half the amount of data, uses the least
amount of CPU time but the largest memory space. The distributed scheme
consumes the largest amount of time while its overall average space for
tracks is almost the same as that of the centralized scheme, although
it maintains two copies of data most of the time.

As the pruning threshold increases, the performance becomes
degraded in every scheme. The deterioration is, however, much more
gradual in the distributed scheme than in the other schemes. From the
N_{FT} and the N_{MT} curves in Figure 6-1, it is observed that the
performance of all three schemes is biased so that the false-track
generation is suppressed far more than the missed-target generation.
This trend is obvious in the decentralized case and in the cases with
higher pruning thresholds for every scheme. This bias is caused by the
fact that the exact statistics, particularly the expected number of
targets and false alarm rates, are used to calculate the likelihood
functions. By doing so, each scheme tends to falsely dismiss targets by
pruning. This aspect will be discussed later. The points of convergence
as the threshold approaches zero represent the performance limit of the
given sensor systems. Although a slight degradation is observed as the
pruning threshold is raised, the relation of the position error
statistics of the three schemes remains almost identical.

6.2.3.2 Expected Number of Targets

Figure 6-2 shows the effect of varying the target density. Since
the field-of-view is fixed, the expected number of targets is exactly
proportional to the density. As the target density increases, the
performance generally degrades and the computational requirements
increase. Performance degradation is most likely due to the creation of
false tracks. These tracks may be created by wrong combinations of
measurement points. The false track statistics in the high target
density region are particularly high in the distributed scheme although
its overall performance, P_{PA}, remains superior to the other two
schemes. This type of degradation appears probably because the effect of
pruning in the fusion program is more crucial than in the local GTC
program in high target density cases. Since the false alarm rate is kept
at the same level, the memory requirement represented by N_{TR} is
almost proportional to the target density. The required CPU time
increases rapidly, however, in the centralized and the distributed
schemes. It is again noteworthy to observe that the distributed case
performs better than the centralized scheme. The position estimation
deteriorates as the target density increases. This is a result of
frequent combining both in the local GTC and the information fusion.
This degradation is also caused by the fact that some of the tracks in
the best hypotheses are entirely or partially formed by false alarms
but, because of high target density, they may be considered "good"
enough tracks of actual targets which "happen" to be in their
neighborhoods.


6.2.3.3 Measurement Error

The effect of varying the sensor measurement error standard
deviation \sigma is shown in Figure 6-3. In general, as \sigma
increases, the validation regions tend to intersect often so that many
clusters become large in size. As a result, there may be a number of
equally probable hypotheses in a cluster. As indicated by the curves,
the hypothesis combining keeps the number of hypotheses under control.
However, in the current implementation of GTC and distributed GTC, all
the track-measurement combinations are first exhausted, then hypotheses
are pruned, and finally combined after updating every track. Thus, even
though the resultant number of tracks is reasonably low, the CPU time
requirement increases rapidly with \sigma. The information fusion
program seems particularly susceptible to this kind of explosion of CPU
time requirement, as indicated by a jump in the T_E curve at
\sigma/L = .005.

This seems to create a noticeable performance degradation of the
distributed scheme for large \sigma, where it seems to lose its definite
superiority over the other schemes. The overall detection (tracking)
performance, however, remains relatively unchanged. It is hard to judge
whether the variations in some criteria, particularly N_{MT}, reflect
significant performance characteristics or are merely created by
random effects. This is so because the sample size is relatively small
(50 runs). On the other hand, the position estimation does not
deteriorate considerably as \sigma increases. In fact, the filtering
improvement of estimates is better when \sigma is large. The effect of
the amount of data is again obvious in the \sigma_{pos} curves although
it becomes insignificant when \sigma is large.

6.2.3.4 False Alarm Rates

The effect of varying the false alarm rates is shown in Figure 6-4.
An increase in the false alarm rate has two effects: (1) an increase
in the amount of data to be processed, and (2) an increase in the
tendency to dismiss tracks. The first effect forces the GTC and the
distributed GTC to discard more hypotheses and the second effect results
in missed targets, as seen in the P_{PA} and the N_{MT} curves. When the
false alarm rates are very high, there may be many occasions where false
alarms align so well that they seem to originate from real targets.
However, since the false alarms are independent, such events may still
be rare. Even if they are not rare, the GTC with non-zero pruning
threshold tends to dismiss many measurements as false alarms. Thus, the
performance degradation due to the increase in the false alarm rates is
most apparent in the N_{MT} curves. On the other hand, the N_{FT} curves
behave rather strangely. This seems to result from the interplay of two
different factors: aligned false alarms and the tendency to dismiss
measurements. When the false alarm rates are high, all three schemes
have almost identical performance. It is particularly significant to
observe that the advantage of having twice the amount of data
disappears in such cases. This is interesting because we do not observe
such phenomena in conventional filtering or estimation problems. In
fact, when the amount of data is doubled, the number of false alarms is
also doubled so that the benefit of having additional data diminishes.
However, the advantage of the large amount of data is reflected by the
\sigma_{pos} curve. The centralized and distributed schemes are able to
select data more accurately without excessive combining, as indicated by
the \sigma_{pos} curve. In the decentralized scheme where the amount of
data is half, there is a tendency to dismiss measurements frequently, as
is apparent from the decreasing curve.


6.2.3.5 Probability of Detection

The last parameter varied is the probability of detection, P_{Dmax},
in Equation (6.2). The result is shown in Figure 6-5. As is to be
expected, as the probability of detection decreases, the performance
deteriorates gradually. The difference in the behavior of the three
schemes is, however, not obvious. In particular, there is no significant
difference when the probability of detection is high. The performance
is dominated by the missed-target statistics whereas the N_{FT} curves
remain at a very low level and show only random changes. The position
estimate errors behave rather strangely. This again reflects the fact
that, when the origin of each measurement is not certain, a larger
amount of data is sometimes advantageous but not always. As seen in the
N_{TR} curves, when the number of scans is small (the decentralized
scheme), the low probability of detection requires more hypotheses and
thus more memory space. However, for the centralized and the distributed
schemes, where the number of scans is twice that of the decentralized
scheme, this trend is not so obvious. The CPU time requirement increases
as the probability of detection decreases in every scheme. However, the
relation among the different schemes remains substantially unchanged.


6.2.4 Summary of Results and Supplemental Discussions


The observations from the comparative statistics obtained by
varying the key parameters may be summarized as follows:

1. In general, as the parameters move to more difficult values, the
performance degrades accordingly in every scheme. In most cases,
however, the relative performance displayed in the baseline case
of the three schemes remains unchanged.

2. When the external conditions are severe (e.g., with high target
density, high false alarm rates, and large measurement errors), the
difference in performance sometimes becomes less obvious. However,
there is no crossover in the performance ordering of the three
schemes.

3. The advantage of using less data (the decentralized scheme) appears
only in saving the CPU time. Under mild external conditions, the
decentralized scheme requires more memory space than the other
schemes.

4. The CPU time requirement increases rapidly with high target
density, high false alarm rates, large measurement error and low
pruning threshold, both in the centralized and the decentralized
schemes.


In order to evaluate the performance, we have used the two demerit
indices, N_{FT} and N_{MT}. In the baseline comparison and the
subsequent sensitivity studies, we have occasionally observed some bias
toward a larger N_{MT}, i.e., a tendency to dismiss targets rather than
to create false tracks. These two indices correspond exactly to the two
basic sensor parameters, the false alarm rate and the probability of
false dismissal (1 - prob. of detection). Thus, like the false alarm
rate and the probability of false dismissal, N_{FT} and N_{MT} are
complementary. For example, when any pruning scheme is employed, there
is always some chance of dismissing a "true" hypothesis consisting of
true tracks of the real targets. This possibility has not been
considered in deriving our basic hypothesis evaluation equations.
Although how to take care of this is an open theoretical problem, the
pair (N_{FT}, N_{MT}) can be affected by changing some parameters used
in the likelihood functions, such as the a priori target density, the
probability of detection and the false alarm rates. The last analysis
in this section shows how the two indices change when the a priori
target density is modified from its true value.

Figure 6-6 shows the performance variation when the expected number
of targets used in the algorithm is modified to \nu_T^M from its true
value \nu_T. As expected, N_{FT} increases and N_{MT} decreases as
\nu_T^M increases. An index representing the overall performance,
P_{PA}, has a peak in every scheme. The peak is, however, at a different
point for each scheme. The determining factor seems to be the amount of
data. Namely, when the amount of data is large, the chance of missing
targets is low so that there is less need for large compensation. The
increase of \nu_T^M, however, forces the GTC and the distributed GTC to
maintain more hypotheses and to use more computation time.

Figure 6-7 shows the two-dimensional change of (N_{FT}, N_{MT})
according to \nu_T^M. The curves in this figure may be regarded as
trade-off curves.

Figure 6-7: Two-Dimensional Change of False and Missed Tracks

They may help the user of the GTC or the distributed GTC in choosing the
right parameters. Other parameters which may be modified are the false
alarm rates and the probability of detection. Moreover, the calculation
of the likelihood functions for newly detected targets can also be
modified. Which parameters should or should not be modified is still an
open question.

6.3 AN ALMOST CONSTANT VELOCITY TARGET EXAMPLE

In this section we consider a class of dynamic target models where
the velocity is almost constant. Such models can approximate targets
which do not maneuver during the sensor observation interval.

6.3.1  Example  Scenario 

Almost  constant  velocity  target  models  are  next  in  complexity  to 
the  simplest  *"*rget  model,  i.e.,  the  stationary  target  model  described 
in  Section  6.2,  but  are  commonly  used  in  the  multitarget  tracking 
literature.  By  an  almost  constant  velocity  target  model,  we  mean  a 
dynamical  model  in  which  the  position  and  the  velocity  constitute  a 
state  and  the  time-derivative  of  the  velocity  is  a  white  noise  vector 
with  small  intensity.  In  this  section,  we  shall  explore  a  one¬ 
dimensional  almost-constant-velocity  target  model,  in  which  each  target 
state  x(t)  at  time  t  has  a  dynamical  model  given  by: 


$$
x(t) = \begin{bmatrix} u(t) \\ v(t) \end{bmatrix}, \qquad
dx(t) = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x(t)\,dt
      + \begin{bmatrix} 0 \\ 1 \end{bmatrix} dw(t)
\tag{6.5}
$$

where u(t) is the position, v(t) the velocity at time t, and $w(\cdot)$
is a Wiener process with $\mathrm{Var}(w(\Delta t)) = Q\,\Delta t$. We
assume two nodes, each of which has a sensor measuring the velocities of
targets as well as their positions. The node layout is shown in Figure
6-8. The initial velocity of each target is assumed to be much larger
than the intensity of the white noise which excites the state, and thus
targets always move in the same direction. Therefore, we assume that
each sensor has a preprocessor which passes only moving objects and
divides all the measurements by their directions into two separate sets,
and that we do not need to associate measurements across the two sets.
Moreover, when each target is viewed as an object moving in a
two-dimensional space (position, velocity), targets never cross each
other although they may pass each other. For this reason, in our
experiments we assume that targets move only in one direction.
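A minimal sketch of how a trajectory could be generated from model (6.5)
at the scan times is given below. It is not the simulation program used
for this report. It assumes the standard exact discretization of an
integrated Wiener process, in which the noise added over one interval
dt has velocity variance Q dt, position variance Q dt^3 / 3 and
cross-covariance Q dt^2 / 2. The function names and the numerical values
in the usage lines are hypothetical.

```python
import math
import random


def step(u, v, dt, q, rng):
    """Advance (position u, velocity v) by one scan interval dt.

    The velocity receives a Wiener increment with variance q*dt; the
    position receives the correlated integral of that increment.
    """
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    dv = math.sqrt(q * dt) * z1
    # Var(du) = q*dt^3/3 and Cov(du, dv) = q*dt^2/2 by construction.
    du = 0.5 * dt * dv + math.sqrt(q * dt ** 3 / 12.0) * z2
    return u + v * dt + du, v + dv


def simulate(u0, v0, dt, q, n_scans, seed=0):
    """Return the list of (u, v) states at scans 0..n_scans."""
    rng = random.Random(seed)
    states = [(u0, v0)]
    for _ in range(n_scans):
        u, v = states[-1]
        states.append(step(u, v, dt, q, rng))
    return states


if __name__ == "__main__":
    # Hypothetical values: unit scan interval, small noise intensity.
    for k, (u, v) in enumerate(simulate(0.0, 0.3, 1.0, 1e-4, 10)):
        print(k, round(u, 3), round(v, 3))
```

With q set to zero the sketch reduces to exact constant-velocity motion,
which is a convenient sanity check.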

Figure 6-8: Node Layout

As shown in the figure, the two sensors have non-overlapping
fields-of-view. The distance between the two fields-of-view is assumed
to be the same as the length of the field-of-view of each sensor. The
two sensors have identical characteristics. The detection probability
for each sensor n is given by:


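$$
P_D^{(n)}(x) = P_D^{(n)}(u, v)
  = P_{D\max} \int_{U^{(n)}} g(u_m - u; R_u)\, du_m
              \int_{V} g(v_m - v; R_v)\, dv_m
\tag{6.6}
$$

where

$$
g(\xi; R) = \left(\sqrt{2\pi R}\right)^{-1}
            \exp\left(-\tfrac{1}{2}\, \xi^2 / R\right)
$$

is the one-dimensional Gaussian density function and

$$
U^{(n)} = [u^{(n)}_{\min},\, u^{(n)}_{\max}]
        = \text{field-of-view of sensor } n
$$

$$
V = [v_{\min},\, v_{\max}]
  = \text{velocity interval observable (same for both sensors)}
$$

The measurement error probability for sensor n is given by:

$$
P_M^{(n)}(y \mid x) = P_M^{(n)}(u_m, v_m \mid u, v)
  = \frac{g(u_m - u; R_u)\; g(v_m - v; R_v)}
         {\int_{U^{(n)}} g(\xi - u; R_u)\, d\xi
          \int_{V} g(\eta - v; R_v)\, d\eta}
\tag{6.9}
$$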


By choosing the left end of the field-of-view of sensor 1 as the origin,
we have:

$$
u^{(1)}_{\min} = 0, \quad u^{(1)}_{\max} = L, \quad
u^{(2)}_{\min} = 2L, \quad \text{and} \quad u^{(2)}_{\max} = 3L.
\tag{6.10}
$$


Thus, the sensor performance is determined by $P_{D\max}$ in (6.6) and
the measurement accuracies

$$
\sigma_u = \sqrt{R_u} \tag{6.11}
$$

and

$$
\sigma_v = \sqrt{R_v}. \tag{6.12}
$$
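The detection probability (6.6) is a product of two Gaussian window
integrals, so it can be evaluated in closed form with the error
function. The sketch below illustrates this under stated assumptions:
the function names are hypothetical, and the numerical values (L = 1,
a velocity window of [.15, .45], a peak detection probability of 0.7 and
the two variances) are illustrative only and are not the baseline values
of Table 6-5.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def window_mass(center, lo, hi, r):
    """Integral over [lo, hi] of g(xi - center; r), r being a variance."""
    s = math.sqrt(r)
    return normal_cdf((hi - center) / s) - normal_cdf((lo - center) / s)


def detection_probability(u, v, fov, vel_window, r_u, r_v, p_dmax):
    """Evaluate (6.6) for a target at position u with velocity v."""
    return (p_dmax
            * window_mass(u, fov[0], fov[1], r_u)
            * window_mass(v, vel_window[0], vel_window[1], r_v))


if __name__ == "__main__":
    fov_1 = (0.0, 1.0)          # sensor 1 field-of-view with L = 1
    vel = (0.15, 0.45)          # observable velocity interval
    for u in (0.5, 0.0, 1.5):   # center, edge, masked region
        print(u, detection_probability(u, 0.3, fov_1, vel, 1e-4, 1e-4, 0.7))
```

A target well inside the field-of-view is detected with probability
close to the peak value, a target exactly on the edge with about half of
it, and a target in the masked region with a probability near zero.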


The initial velocities of the targets have the distribution shown
in Figure 6-9. The initial positions were assumed to be uniformly
distributed on an interval U which is chosen so that we could expect a
target at any position on [0, 3L] when the first scan was made and a new
target could appear at the last scan. We further assumed that the two
sensors were synchronized with the identical scan interval $\Delta t$
and that $v_{\min} = .15\,(L/\Delta t)$ and $v_{\max} = .45\,(L/\Delta t)$
so that (6.6) - (6.9) made sense. The number of false alarms from each
sensor was assumed to be Poisson with a fixed mean, and each false alarm
is distributed uniformly on $U^{(n)} \times V$.

Figure 6-9: A Priori Distribution of Initial Velocity
($\Delta t$ = scan interval)

The baseline conditions chosen for this example are shown in Table
6-5.

Table 6-5: Baseline Parameters for Almost Constant Velocity Case


The three communication schemes introduced in the previous section
(decentralized, centralized and distributed) were examined. The
simulation time is chosen to be 10 scan intervals. In the decentralized
scheme, each node processes only the data available to it. In the
centralized scheme, all the data is processed in a centralized manner
either by a single processing node or redundantly by two nodes with
every data set exchanged at each scan between the two nodes. On the
other hand, in the distributed scheme, the communication between nodes
is much less frequent and is at the hypothesis level rather than the
sensor measurement level. In this example, the distributed scheme
communicates every 5 scans, i.e., only twice in the total length of
the scenario.
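The difference in communication load between the three schemes can be
made concrete by listing the scans at which the nodes exchange
information. The sketch below only restates the schedule described
above; the function name and the dictionary of periods are hypothetical,
with `None` standing for a scheme that never communicates during the
run.

```python
def communication_scans(total_scans, period):
    """Return the 1-based scan indices at which the nodes communicate.

    A period of None means that no exchange takes place during the run.
    """
    if period is None:
        return []
    return [k for k in range(1, total_scans + 1) if k % period == 0]


# Exchange period, in scans, for each scheme of this example.
PERIODS = {"decentralized": None, "centralized": 1, "distributed": 5}

if __name__ == "__main__":
    for name, period in PERIODS.items():
        print(name, communication_scans(10, period))
```

In the centralized scheme each exchange carries raw measurement sets,
whereas in the distributed scheme each of the two exchanges carries
clusters, hypotheses and tracks.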

Since in this example the targets move only in one direction (left
to right), we cannot compare the node performance in the decentralized
scheme with other schemes using the same performance measures of the
previous example. For example, the downstream node 2 may have a chance
of seeing only half of the targets getting into the region, and the role
of the upstream node 1 is asymmetric with respect to node 2.
For this reason, we evaluate the decentralized scheme by assuming an
additional ad hoc coordination system which integrates the outputs of
the two nodes. First, the best hypothesis is extracted from each node.
Tracks in the best hypotheses from the two nodes are then tested by a
thresholding modified rectangular Munkres algorithm [13]. A track
from node 1 and a track from node 2 are declared to originate from a
single real target when the distance between the best estimates of the
position-velocity pairs is less than a given threshold. Let $\lambda_1$
and $\lambda_2$ be the hypotheses from the two nodes in the
decentralized scheme and $T_{ST}$ be the set of pairs $(\tau_1, \tau_2)$
which are judged to originate from a single target as described above.
Then, we consider the following two sets of tracks as outputs of the two
different coordination systems:

(1) AND-LOGIC SIMPLE FUSION

$$
\{\, \tau_1 \cup \tau_2 \mid (\tau_1, \tau_2) \in T_{ST} \,\}
$$

(2) OR-LOGIC SIMPLE FUSION

$$
\{\, \tau_1 \cup \tau_2 \mid (\tau_1, \tau_2) \in T_{ST} \,\}
\;\cup\; \{\, \tau_1 \in \lambda_1 \mid \text{there is no } \tau_2
         \text{ such that } (\tau_1, \tau_2) \in T_{ST} \,\}
\;\cup\; \{\, \tau_2 \in \lambda_2 \mid \text{there is no } \tau_1
         \text{ such that } (\tau_1, \tau_2) \in T_{ST} \,\}
$$

The reason that we have named such coordination "simple" is apparent,
as are the adjectives "and-logic" and "or-logic". The introduction
of these coordination systems for the decentralized scheme serves a
two-fold objective: to test the effect of the frequency of
communication at the hypothesis level; and to see how well or poorly
such simple "heuristic" operations as described above perform when
compared with more well-defined outputs from the centralized and the
distributed schemes.
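The two simple fusion rules can be sketched as follows. This is an
illustration under stated assumptions, not the coordination program used
in the experiments: a brute-force optimal assignment stands in for the
modified rectangular Munkres algorithm of [13], the track distance is a
plain Euclidean distance between (position, velocity) estimates that are
assumed to be already normalized and referred to a common time, and the
data layout, function names and numbers are hypothetical.

```python
from itertools import permutations


def track_distance(a, b):
    """Euclidean distance between two (position, velocity) estimates."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def match_tracks(h1, h2, threshold):
    """Return index pairs (i, j) judged to originate from one target.

    Exhaustive search over one-to-one assignments, keeping only pairs
    closer than the threshold; more matches win, then smaller total cost.
    """
    best, best_key = [], (0, 0.0)
    columns = list(range(len(h2))) + [None] * len(h1)
    for perm in permutations(columns, len(h1)):
        pairs = [(i, j) for i, j in enumerate(perm)
                 if j is not None
                 and track_distance(h1[i]["est"], h2[j]["est"]) < threshold]
        cost = sum(track_distance(h1[i]["est"], h2[j]["est"])
                   for i, j in pairs)
        key = (-len(pairs), cost)
        if key < best_key:
            best, best_key = pairs, key
    return best


def and_fusion(h1, h2, threshold):
    """Keep only the tracks on which both nodes agree."""
    return [h1[i]["meas"] | h2[j]["meas"]
            for i, j in match_tracks(h1, h2, threshold)]


def or_fusion(h1, h2, threshold):
    """Keep agreed tracks plus every unmatched track from either node."""
    pairs = match_tracks(h1, h2, threshold)
    used1 = {i for i, _ in pairs}
    used2 = {j for _, j in pairs}
    fused = [h1[i]["meas"] | h2[j]["meas"] for i, j in pairs]
    fused += [t["meas"] for i, t in enumerate(h1) if i not in used1]
    fused += [t["meas"] for j, t in enumerate(h2) if j not in used2]
    return fused


# Hypothetical best hypotheses: each track has an estimate and the set of
# measurement labels it is built from.
NODE_1 = [{"est": (2.10, 0.30), "meas": frozenset({"n1-a", "n1-b", "n1-c"})},
          {"est": (0.60, 0.20), "meas": frozenset({"n1-d", "n1-e"})}]
NODE_2 = [{"est": (2.60, 0.35), "meas": frozenset({"n2-a", "n2-b"})},
          {"est": (2.18, 0.31), "meas": frozenset({"n2-c"})}]

if __name__ == "__main__":
    for threshold in (0.2, 0.05):
        print(threshold,
              len(and_fusion(NODE_1, NODE_2, threshold)),
              len(or_fusion(NODE_1, NODE_2, threshold)))
```

With the looser threshold the shared target is recognized, so the
and-logic output has one track and the or-logic output has three. With
the tighter threshold no pair is accepted, so the and-logic output is
empty and the or-logic output carries all four tracks, one of them a
duplicate of the shared target.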

The same set of performance criteria as those described in Section
6.2.2 is used; $N_{FT}$, $N_{MT}$, $P_{PA}$, $\sigma_{POS}$, $T_E$ and
$N_{TR}$ shown in Table 6-2 are used to evaluate the performance of
every scheme. Since we have the velocity component in a target state
and the velocity measurement in every sensor measurement, an additional
criterion, i.e., the velocity estimation error $\sigma_{VEL}$, is
included. The evaluation procedure for each scheme is the same as that
described in Section 6.2.2.

6.3.3  A  Sample  Run 

In  this  section  we  describe  the  results  of  a  single  sample  run. 
First  we  describe  the  data  from  this  run,  and  next  we  summarize  the 
operation  of  each  of  the  three  schemes. 


6.3.3.1  Sample Run: Data


Figure 6-10 shows a typical sample of the 10-scan position
measurements from two nodes under the baseline condition. There are 73
measurements in total from the two nodes (sensors) in the 10 scans. Out
of these 73 measurements, only ten originate from the real targets while
the rest are false alarms. As seen in this figure, the nominal
condition can be characterized as a high-false-alarm-rate,
low-probability-of-detection case which makes the tracking difficult.
From these data, it is almost impossible to extract tracks by eye. Of
course, this is true particularly because Figure 6-10 shows only
position measurements. Figure 6-11 shows the same sample where velocity
measurements are attached. In this figure, each velocity measurement is
represented by a unit-length arrow originating from its position
measurement and pointing to the expected position at the next scan time.
Thus an arrow pointing straight down indicates zero velocity and an
arrow pointing to the right indicates infinite velocity. With the high
false alarm rate, the relatively low probability of detection and the
existence of the masked region, it is still very difficult for the human
eye to extract target trajectories.

Figure 6-12 shows the true target trajectories of this sample, and
Figure 6-13 is the superposition of the measurement data (Figure 6-11)
and the true target trajectories (Figure 6-12). Target 1 appears in the
masked region and moves eventually into the field-of-view of sensor 2
where it has the opportunity of being detected four times, but is
actually detected only twice. The rest of the targets, 2, 3, and 4,
appear in the field-of-view of sensor 1. Only target 2 moves across
into the field-of-view of sensor 2. Targets 2 and 3 are detected
whenever they have a chance of being detected. On the other hand,
target 4 is never detected.

6.3.3.2  Sample Run: Decentralized Scheme

At the end of scan 5, node 1 confirms target 2, i.e., all the
hypotheses contain the track of the three measurements while the other
targets have not yet been detected. By scan 10, target 3 has been
confirmed by node 1, and targets 1 and 2 have been confirmed by node 2.
Thus, at the end of the sample run, the best hypotheses contain tracks
for three distinct targets: targets 2 and 3 in node 1, and targets 1
and 2 in node 2. Consequently, the simple AND-logic fusion should form
one track corresponding to target 2 while the simple OR-logic fusion
should form three tracks corresponding to targets 1 to 3 if the fusion
mechanisms work properly. However, due to the particular value for the
threshold used by the Munkres assignment algorithm, the two tracks which
should be identified are not judged to originate from the same target.
As a result, the simple AND-logic fusion forms no track, i.e., no false
track but four missed targets, while the simple OR-logic fusion forms
four tracks, i.e., one false track and one missed target.


6.3.3.3  Sample Run: Centralized Scheme

At the end of the last scan, there are 18 clusters. Three of them
have confirmed tracks which correctly correspond to the real targets, 1
to 3. All other clusters have only tentative tracks with small
probabilities. Thus, there is no false track and one missed target.
Since the missed target is undetected by all the sensors, there is no
chance for it to appear in the output of any scheme.

6.3.3.4  Sample Run: Distributed Scheme

Up to scan 5, everything works in exactly the same way as in the
decentralized scheme, i.e., target 2 has been confirmed by node 1 while
the other targets have not been detected yet and all other tracks are
tentative. At the end of scan 5, the information, in terms of clusters,
hypotheses and tracks, from the two nodes is "fused". In the first step
of this fusion, the track-to-track likelihood function is calculated for
each pair $(\tau_1, \tau_2)$ of tracks from nodes 1 and 2. It turns out
that the only feasible (positive likelihood) pairs are in the form of
either $(\tau_1, \phi)$ or $(\phi, \tau_2)$. This reflects the true
situation that no target has appeared in both of the fields-of-view.
One of the tentative tracks from node 1 indicates that, if it had
originated from a real target, it should have been detected also by
node 2. Since it has not been detected, this tentative track is
rejected. Thus, out of 7 possible tracks from node 1 and 8 tracks from
node 2, one (correctly) confirmed track and 13 tentative tracks become
common tracks to both nodes.

Node 2 correctly extends the confirmed track when the target enters
the field-of-view of sensor 2. With respect to the other targets, 1, 3
and 4, the distributed scheme performs in a way very similar to the
other schemes until the last scan, after which the second information
fusion is performed. At the end of the last scan (before the second
fusion), several of the previously fused tracks have been eliminated by
each node. There are ten new tracks (including the confirmation of
target 3) from node 1 and 2 tracks (including the confirmation of target
1) from node 2. When the track-to-track likelihood function is
calculated, four previously fused tracks are eliminated because one of
the nodes has rejected them. Subsequently, the confirmed and previously
fused track is fused again with new information being provided by node
2, while the other five previously fused tracks have not gained any
additional information because no real target is expected to have
appeared within the field-of-view of any sensor. Thus, the information
(clusters, hypotheses and tracks) obtained by this second fusion is the
same as that obtained by the centralized scheme except for small errors
in track distribution parameters, track likelihoods and probabilities
introduced by approximations in the information fusion program. In this
sample run, the performance of the centralized scheme is almost
identical to that of the distributed scheme. This means that the
hypothesis management used in the centralized GTC, the distributed GTC's
and the information fusion program have not altered the objective of the
distributed system, i.e., to produce the performance of a centralized
system while distributing the tracking tasks between the two nodes.

6.3.4  Monte Carlo Simulation Results


Although what is discussed above is only a sample, it turns out
that this performance comparison of the centralized and the distributed
schemes is quite typical at least for the baseline case. This result is
in contrast to that of the previous section where we saw a slight
advantage of the distributed scheme over the centralized scheme. The
similar performance of the two schemes is probably due to two factors.
First, only a small fraction of targets appear in the field-of-view of
the sensors because of the relatively large masking. Second, false
alarms do not disturb the performance greatly despite the relatively
high density because the velocities in the measurements serve as strong
discriminants. Table 6-6 shows the base case comparison of the four
schemes obtained by 100-run Monte Carlo simulations.

It is obvious from Table 6-6 that the two decentralized schemes
perform worse than the centralized and the distributed schemes. This
result shows a clear advantage of frequent communication over less
frequent communication and of rigorous fusion algorithms over heuristic
fusion algorithms. As could be expected, the simple AND-logic scheme's
performance is the worst because of its extreme cautiousness. Since
there is little chance that both nodes confirm the same target, this
scheme tends to create many missed targets. On the other hand, the
simple OR-logic scheme performs almost as well as the centralized and
the distributed schemes with respect to the missed target statistics.
However, since this scheme accepts non-agreed-upon tracks rather
blindly, the number of false tracks is exceptionally high. Nevertheless,
when we view $P_{PA}$, the probability of perfect association, as a
single proxy of performance for each scheme, the simple OR-fusion
decentralized scheme's performance is quite close to that of the
centralized and distributed schemes.

Notes to Table 6-6:

(*)   Normalized by the standard deviation of the position
      measurement error.

(**)  Normalized by the standard deviation of the velocity
      measurement error.

(***) Average execution time for one run:
      Decentralized:  20 Data Sets
      Centralized:    20 Data Sets
      Distributed:    20 Data Sets + 2 Fusion Operations

The comparison of performance between the centralized and the
distributed schemes is not conclusive since both perform equally well.
The trend in the computational requirement is very similar to that
observed in the previous example. The decentralized system uses much
less CPU time but requires more space. The distributed scheme requires
more time and space than the centralized scheme. The average estimation
error is comparable to the measurement error for the velocity and is
about 5 to 9 times larger for the position. This is so because the
average number of measurements in a track is very small due to the low
probability of detection and the large masking. The position estimation
errors are further worsened since the random change in the velocity is
accumulated and added to the positional uncertainty.
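The accumulation effect can be made explicit for the model (6.5). As a
sketch, assume that after the last measurement update the state has
position variance $P_u$, velocity variance $P_v$ and cross-covariance
$P_{uv}$ (these three symbols are introduced here for illustration and
do not appear elsewhere in this report). Since
$u(t) = u(0) + v(0)\,t + \int_0^t \left[ v(s) - v(0) \right] ds$ and the
velocity increment is a Wiener process with intensity Q that is
independent of the initial state, propagating over an interval t without
further detections gives:

```latex
\begin{align}
\mathrm{Var}[v(t)] &= P_v + Q\,t, \\
\mathrm{Var}[u(t)] &= P_u + 2\,t\,P_{uv} + t^2 P_v + \tfrac{1}{3}\,Q\,t^3 .
\end{align}
```

The velocity uncertainty grows only linearly in t, while the position
uncertainty picks up both a quadratic term from the velocity error and a
cubic term from the process noise. With few detections per track, the
gaps between updates are long, which is consistent with the position
error being several times the measurement error while the velocity error
stays comparable to it.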

Comparative statistics were obtained by 100-run Monte Carlo
simulations for each of the six key parameters (i.e., the pruning
threshold, the target density, position measurement error, velocity
measurement error, false alarm rate and probability of detection). The
parameter values used in these tests are summarized in Table 6-7. The
results are summarized in Tables 6-8 to 6-13.
Table 6-12: Sensitivity to False Alarm Rate

Table 6-13: Sensitivity to Probability of Detection


6.3.4.1  Pruning Threshold

Table 6-8 examines the effect of the pruning threshold. The
centralized and distributed schemes perform about equally and show the
same tendencies. With a low pruning threshold, both schemes perform
better but require more computational resources, and with a higher
threshold, they perform worse but with fewer resources. In contrast, it
is more difficult to explain how the two decentralized schemes'
performance varies with the pruning threshold. The worse performance
under a high pruning threshold is naturally expected. However, the two
decentralized schemes also perform worse than the baseline with the
lower pruning threshold. This result is probably due to the failure of
the simple fusion schemes used in the two decentralized schemes, or the
infrequent communication which tends to produce very different situation
assessments between the two nodes, or a combination of these two
factors. As far as the computational requirements are concerned, the
two decentralized schemes behave normally.

6.3.4.2  Target Density

The response to the change in the target density, shown in Table
6-9, also displays a similar trend. When the target density is high, we
see a clear difference in performance between the two groups of schemes,
i.e., (1) the two decentralized schemes and (2) the centralized and
distributed schemes. When the target density is low, however, the
difference in performance becomes less obvious. Even the two
decentralized schemes perform well since the chance of missing targets
is very low. On the other hand, with a high target density, the two
decentralized schemes' performance obviously deteriorates. As in the
previous comparison, the responses of the centralized and the
distributed schemes are very similar.


6.3.4.3  Other Parameters

The effect of measurement errors, shown in Tables 6-10 and 6-11,
shows more or less the expected general trends. Namely, the larger the
measurement error, the more confusing the measurements become and the
more the performance of all of the schemes degrades. However, the
response of the two decentralized schemes is rather flat compared with
the centralized and distributed schemes. This indicates a certain
degree of robustness of the relatively simple decentralized schemes and
their inability to take advantage of the improved external condition
(due to the infrequent communication and use of a rather heuristic
fusion scheme). In the last two comparisons, shown in Table 6-12 (false
alarm rate) and Table 6-13 (probability of detection), we see generally
the same trend. However, as we have observed in other comparative
statistics, the performance of the two decentralized schemes does not
change much. In particular, the simple AND-logic scheme responds to the
external conditions in a direction opposite to the expected direction.

6.4  SUMMARY OF NUMERICAL EXAMPLES


This section summarizes the results obtained in the two numerical
examples. The two simple examples were chosen so that we could isolate
the basic characteristics of each of the three communication schemes
more readily. These examples, however, represent two radically
different situations.

Example  1: 

The targets were stationary and both nodes had the same
field-of-view. Thus, there was a high degree of informational overlap
(i.e., redundancy), and each node could have performed tracking
adequately by itself under normal conditions. This situation results in
fairly respectable performance for the decentralized scheme (no
communication). On the other hand, since the data from the two nodes
are highly correlated (being from the same set of stationary targets),
the performance is sensitive to the frequency and type of communication
(decentralized versus centralized versus distributed). Furthermore,
since correlated data implies the storage and evaluation of a large
number of hypotheses, any change in a key parameter such as the pruning
threshold affects the performance differently for different schemes.

Example  2: 

In contrast, in Example 2, the targets were moving in one
direction and the fields-of-view of the sensors in the two nodes were
disjoint; therefore, the two nodes received mostly independent and
complementary information. In such a case, communication becomes
essential and can improve the performance substantially. Particularly,
when each node is operating independently (distributed or decentralized
scheme), proper coordination (both communication and information fusion)
becomes more important. Moreover, since targets were moving, the timing
of communication would be quite critical. For example, in a distributed
scheme, a set of measurements may be erroneously dismissed as false
alarms if it is obtained right after a communication instant; the node
would have to wait a long time before the next round of communication.
The same set of measurements could have been confirmed correctly to have
come from a target if the other node could communicate the presence of a
track at a crucial time.

In short, Example 1 is a case where the centralized scheme is
vulnerable because of the large amount of correlated data, while Example
2 is one where the distributed scheme is vulnerable. As a result, we
have observed a slight advantage of the distributed scheme over the
centralized scheme in Example 1, and almost identical performance in
Example 2. When we consider the non-zero pruning threshold as a proxy
of the computational resources, the results of Example 1 clearly show an
advantage of distributed processing. The external environment in
Example 2 is generally unfavorable to the distributed scheme. However,
the results indicate that with proper coordination such as the
information fusion algorithm used in the distributed GTC, the same
performance as the centralized scheme can be achieved.


From both examples, we notice the quality of information is as
important as the quantity of information. We have found that when the
quality of information is low (e.g., when the false alarm rate is high),
the quantity of information does not positively correlate with the
performance. This observation is, to some degree, contrary to the
results of conventional filtering systems where it is always beneficial
to have more data.

It is difficult to draw general conclusions from the simulation
experiments since performance depends on environmental factors (target
density, false alarm rates, measurement error) as well as computing
resources (pruning threshold). Based on the simulation results,
however, we can draw the following conclusions. When the amount of data
is large and the quality is low (high target density, low detection
probability, and high false alarm rates), a distributed scheme where
only hypotheses are communicated is generally preferred since the amount
of data handled at each node is smaller. The advantage can only be
realized, however, if proper coordination in terms of communication
times and fusion algorithms is used. An ad hoc coordination algorithm,
such as that used in the decentralized schemes in Example 2, may not
perform well. These advantages are in addition to others such as
reliability, cost, etc. mentioned in the introduction of this report.

7.  CONCLUSIONS


In  this  report,  we  have  described  the  results  of  our  research  on 
the  distributed  situation  assessment  problem  using  a  distributed  sensor 
network.  We  had  two  specific  goals  for  our  research: 

-  investigate  techniques  of  hypothesis  representation,  formation, 
evaluation,  etc.,  in  a  distributed  sensor  network; 

-  investigate  various  tradeoffs  such  as  computation  versus  communica¬ 
tion,  and  the  performance  of  centralized,  decentralized  and  distri¬ 
buted  structures  as  a  function  of  various  parameters. 

Although  we  dealt  mostly  with  general  but  highly  idealized  models, 
the tracking and classification of multiple targets in a low signal-to-noise
ratio and highly cluttered environment was chosen as an application
area to focus our attention. Our approach has been both analytical and
heuristic.  Exact  algorithms  were  developed  using  precise  mathematical 
models  and  combined  with  more  heuristic  rules  in  their  implementation. 
Simulation  experiments  were  also  conducted  to  understand  issues  which 
are  not  amenable  to  analytic  studies. 

To  provide  a  mathematical  foundation  for  the  multitarget  tracking 
problem,  we  have  developed  a  theory  for  multitarget  tracking  and  clas¬ 
sification.  The  centralized  version  has  been  implemented  in  the  form  of 
the  Generalized  Tracker/Classifier,  which  includes  many  existing  trackers 
as  special  cases.  This  theory  addresses  the  issues  of  how  hypotheses 
should be represented, formed, evaluated and managed in the processing
of  local  data. 

The  centralized  algorithm  was  decomposed  to  obtain  the  Distributed 
Generalized  Tracker/Classifier.  The  processing  architecture  at  each 
node  was  specified  and  consists  of  the  following  three  modules:  the  Gen¬ 
eralized  Tracker/Classifier  for  processing  of  local  sensor  data,  the 
information  fusion  module,  and  the  information  distribution  module.  We 
have  thus  addressed  many  of  the  issues  associated  with  information 
integration  in  a  network.  The  general  problem  of  distributed  estimation 
by  a  network  of  agents  has  also  been  considered.  Algorithms  which  allow 
each  agent  to  integrate  or  fuse  the  information  from  other  agents 
without  redundant  use  of  the  same  information  have  been  devised. 

The  algorithms  have  been  tested  on  two  different  scenarios  to 
evaluate  their  performance.  The  sensitivities  of  the  performance  of 
various  communication  schemes  to  several  parameters  were  investigated. 

We  found  that  having  more  data,  as  in  a  centralized  situation,  is  not 
necessarily  better  unless  resources  are  available  to  process  the  data. 

In  general,  the  quality  of  the  information  is  more  important  than  the 
amount  of  data.  With  a  properly  coordinated  distributed  scheme,  where 
only  hypotheses  are  communicated,  performance  similar  to  that  of  the 
centralized  scheme  can  be  achieved. 

Our  research  has  addressed  some  of  the  basic  issues  related  to  the 
design and operation of a DSN. Specifically, we have developed algorithms
for distributed multitarget tracking and classification, investigated
their performance and compared it with other communication
schemes.  To  fully  capitalize  on  the  potential  of  a  DSN,  we  need  to 
address  some  other  issues.  These  include: 

-  how  to  handle  large  networks  with  many  heterogeneous  sensors 

-  how  to  schedule  the  communication  among  the  nodes  efficiently 

-  how  to  allocate  the  sensor  resources  to  optimize  the  performance  of 
the  network 

-  how  to  make  the  nodes  adapt  to  changing  network  conditions  such  as 
failures  of  nodes  and  communication  links 

-  how  to  evaluate  the  performance  of  such  a  distributed  system 

- how to reduce the vulnerability of the DSN to hostile activities

Some of these issues can be addressed mathematically, while others have
to  be  handled  by  more  heuristic  or  symbolic  techniques  such  as  artifi¬ 
cial  intelligence.  In  addition,  the  real  time  implementation  of  these 
algorithms  in  a  node  also  poses  some  very  relevant  problems  in  computer 
(VLSI)  architecture  design. 

MULTITARGET MULTISENSOR TRACKING PROBLEM - PART 1:

A  GENERAL  SOLUTION  AND  A  UNIFIED  VIEW  ON  BAYESIAN  APPROACHES 


S.  Mori,  C.Y.  Chong,  E.  Tse,  and  R.P.  Wishner 

ABSTRACT 

Based  upon  a  general  target/sensor  model,  a  very  general  solution  to 
the  multitarget  tracking  problem  is  derived.  When  this  solution  is  applied 
to a special class of models consisting of independent, identically distributed
(i.i.d.) target models, a less general but more implementable class
of  multitarget  tracking  algorithms  is  obtained.  Some  existing  algorithms  are 
then  examined  based  upon  a  unified  view  created  by  our  derivation  of  general 
tracking  algorithms.  Part  1  covers  most  of  the  analytic  results,  while  in 
Part  2,  hypothesis  management  and  other  issues  pertaining  to  implementation 
of  multitarget  tracking  algorithms  are  discussed  with  a  simple  numerical 
example. 


I.  Introduction 


During  the  past  decade,  the  multitarget  tracking  problem  has  attracted 
a  great  number  of  researchers,  especially  in  the  fields  of  control  and  esti¬ 
mation.  The  problem  is  both  theoretically  interesting  and  very  important  in 
terms  of  applications.  Technically,  it  calls  for  a  new  body  of  theory  or  a 
large  collection  of  standard  techniques  from  various  fields  such  as  modelling, 
stochastic  inference,  nonlinear  filtering,  etc.  Its  wide  range  of  applications 
includes  anti-missile/aircraft  defense,  air  traffic  control,  ocean/battlefield 
surveillance,  etc.  Past  achievements  in  this  area  are  well  documented  in  the 
survey  paper  by  Bar-Shalom  [1]  and  the  Naval  Ocean  Surveillance  Correlation 
Handbooks, [2] and [3]. The introductory section of the paper by Reid [4]
also  contains  a  short  but  excellent  survey. 

Despite  many  efforts  in  this  area,  the  present  stage  of  research  may  well 
be  characterized  as  an  unorganized  collection  of  numerous  "named"  or  "unnamed" 
algorithms.  An  attempt  to  create  a  unified  view  of  these  algorithms  is  des¬ 
cribed  in  a  recently  published  paper  [5].  However,  the  focus  is  on  the  relation¬ 
ship  between  multitarget  tracking  and  other  new  topics  such  as  event-driven 
linear  systems,  etc.,  and  on  the  interpretation  of  Reid's  algorithms  described 
in  [4].  The  object  of  our  paper  is  to  provide  a  general  Bayesian  solution 
to  a  general  but  mathematically  rigorous  model  and  to  provide  a  unified  view 
of  Bayesian  approaches  to  the  multitarget  tracking  problem.  In  doing  so,  we 
may  have  clearer  interpretations  of  many  existing  algorithms  and  a  better 
understanding  of  what  is  necessary  for  future  theoretical  developments  in  this 
area. 

In short, the multitarget tracking problem is concerned with tracking an
unknown number of targets using noisy measurements whose origins are not certain
and which may not originate from any target at all (false alarms, clutter,
etc.). The basic and crucial deviation from conventional estimation problems
is  the  fact  that  targets  (objects  to  be  tracked)  as  well  as  measurements  (returns, 
sensor  outputs)  are  modeled  properly  only  when  they  are  considered  as 
random  sets  in  the  sense  defined  in  [6].  Namely,  (1)  the  number  of  targets, 
the  number  of  measurements,  etc.,  are  random  and  (2)  the  targets  and  the  mea¬ 
surements  are  essentially  unordered  tuples.  For  example,  targets  do  not  have 
a  priori  labels  and  the  measurement  tuple  (a,b)  has  the  same  meaning  as  (b,a). 

We  may  (tentatively)  call  such  a  nature  a  random-set  property  or  feature. 

In  other  words,  one  of  the  fundamental  features  of  multitarget  tracking  is  the 
random-set feature. Thus the uncertainty of origins of the measurement data
is naturally modeled as a stochastic system which converts a random set (the set
of targets) into many other random sets (the measurement data sets).

Theories  of  random  sets  are  mainly  concerned  with  uncountable-set-valued 
random  sets  and  are  mathematically  highly  sophisticated.  Fortunately,  when  we 
restrict  ourselves  to  random  sets  whose  cardinalities  are  finite  with  probability 
one, we can still apply standard probabilistic techniques. For example, a
random finite set X of reals can be probabilistically completely described by
specifying a probability $\mathrm{Prob.}\{\#(X) = n\}$ for each nonnegative integer n
and a joint probability density function $p_n(x_1, \ldots, x_n)$ of the elements of
the set for each positive n. (In this paper, #(A) is the cardinality of a set A.)
In order for this specification to be appropriate, we must require every $p_n$ to be
interchangeable (permutable). This is the basic approach which we take in this
paper.  As  in  almost  all  the  existing  literature  on  multitarget  tracking,  the 
basic task is to hypothesize the origin of each measurement and to evaluate
every  possible  hypothesis,  or  in  other  words,  to  create  and  rank  all  the  possible 
combinations. Although one may discern some similarity between multitarget tracking
algorithms and classical hypothesis testing formulas, especially chi-square
testing, the difference in underlying models is very obvious. To achieve this
basic  task,  we  propose  new  definitions  for  tracks  and  hypotheses,  which  we 
believe  are  both  mathematically  rigorous  and  intuitively  appealing,  and  in  fact 
are  included,  at  least  implicitly,  in  almost  all  the  existing  multitarget 
tracking  algorithms. 
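
The random-finite-set description used above (a cardinality distribution plus a permutable joint density for each cardinality) can be illustrated with a small sketch. The Poisson cardinality law, the Gaussian element density, and all function names below are assumptions chosen only for illustration; they are not part of the model in this paper.

```python
import math
import random
from itertools import permutations


def cardinality_pmf(n, mean=2.0):
    """Prob.{#(X) = n}: a Poisson law, assumed here only for illustration."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


def element_density(x, sigma=1.0):
    """Density of a single element: zero-mean Gaussian (an assumption)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def joint_density(xs):
    """p_n(x_1, ..., x_n) for i.i.d. elements; a product is permutable."""
    return math.prod(element_density(x) for x in xs)


def is_permutable(xs, tol=1e-12):
    """Check that the joint density is unchanged under every reordering."""
    ref = joint_density(xs)
    return all(abs(joint_density(p) - ref) <= tol for p in permutations(xs))


def sample_finite_set(rng, mean=2.0, max_n=50):
    """Draw the cardinality first, then the elements; return them unordered."""
    # Inverse-transform sampling of the cardinality, truncated at max_n.
    u, n, cdf = rng.random(), 0, cardinality_pmf(0, mean)
    while u > cdf and n < max_n:
        n += 1
        cdf += cardinality_pmf(n, mean)
    return frozenset(rng.gauss(0.0, 1.0) for _ in range(n))


rng = random.Random(0)
print(sorted(sample_finite_set(rng)))
print(is_permutable((0.3, -1.2, 2.0)))
```

Returning a `frozenset` rather than a list reflects the point made in the text: the elements carry no a priori labels, so (a, b) and (b, a) are the same realization.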

In  many  cases,  in  order  to  broaden  one’s  perspective  and  obtain  deeper 
understanding,  it  is  best  to  start  with  a  general  model  and  go  into  greater 
detail  later  with  a  more  restrictive  class  of  models.  In  the  rest  of  this 
paper  we  will  proceed  according  to  this  philosophy,  defining  a  fairly  general 
model  in  the  next  section  and  following  that  with  two  sections  in  which  the 
definition  and  the  Bayesian  evaluation  of  hypotheses  are  described.  Then  we 
will  discuss  an  important  subclass  of  problems,  i.e.,  what  we  may  call  i.i.d. 
(independent, identically distributed) target models. The importance of this
subclass  is  two-fold:  (1)  It  provides  us  with  a  set  of  implementationally 
feasible  algorithms;  (2)  A  unified  view  of  existing  algorithms  will  emerge. 

Part  1  covers  most  of  the  theoretical  issues  whereas  Part  2  describes  hypothe¬ 
sis  management  and  other  implementation  issues  with  a  simple  numerical  example. 

In our terminology, a target is a generic name for the smallest unit of
object which, when detected by a sensor at a certain time, generates some
measurement(s) in the sensor's output with a certain probability. In our general
model, all targets of interest are modeled as one entity rather than as a
collection of individual targets. Formally, a target system state at time t
is a realization $(X(t), N_T(t))$ at time t of a continuous-time stochastic process
$(X, N_T)$ on a target system state space which is the direct-sum space

$$\bigcup_{n=0}^{\infty} \mathcal{X}_n \times \{n\}$$

of a system $\{\mathcal{X}_n\}$ of hybrid sets. By a hybrid set, in this paper, we mean a
direct product space of a subset of a Euclidean space (called continuous part)
and a finite set (called discrete part). The use of hybrid sets allows us to
consider different kinds (types) of targets, sudden structural changes in
dynamics (maneuvering targets), changes in operational modes, etc., as well as
the usual physical states such as positions and velocities. The second element,
$N_T$, represents the total number of targets in the system. When n=0, $\mathcal{X}_n$ is
defined as $\{\theta\}$ where $\theta$ is merely a symbol for "no target" and
$\theta \notin \mathcal{X}_n$ for all positive n.
For  this  paper,  we  make  the  following  assumption: 


The $N_T$ component of the target system state $(X, N_T)$ is a constant but
random nonnegative integer with a given probability distribution. For each
positive integer n, given $N_T = n$, X is a time-homogeneous Markov process on
$\mathcal{X}_n$ associated with an initial distribution

$$Q_n(dX) = \mathrm{Prob.}\{X(t_0) \in dX \mid N_T = n\} \tag{1}$$

for a fixed $t_0$ and a transition probability

$$\mathrm{Prob.}\{X(t + \Delta t) \in dX \mid X(t) = X,\ N_T = n\} \tag{2}$$

for each $(X, \Delta t) \in \mathcal{X}_n \times (0, \infty)$ and each $t \in [t_0, \infty)$.



The time-homogeneity (stationary transition) assumption can be easily
removed but reduces the notational complexity in this paper. The requirement
for $N_T$ to be constant is not very restrictive. For example, to consider the
possibility of disappearing targets we may include a component such as
{active, inactive} in the target system state space and construct an appropriate
birth-death-type Markov process.


For each positive n, we assume that the $\mathcal{X}_n$ component of the target system
state space is further decomposed into two parts as

$$\mathcal{X}_n = \mathcal{X}^{C} \times (\mathcal{X}^{I})^{n}$$

where $\mathcal{X}^{C}$ is the space for the part representing the common target state and
$(\mathcal{X}^{I})^{n}$ is the direct product of n identical individual target state
spaces, $\mathcal{X}^{I}$. A simple example is the one in which $\mathcal{X}^{C} = \{\theta\}$
(no common state space) and $\mathcal{X}^{I}$ is a Euclidean space. Another
example is a model with $\mathcal{X}^{C} = R^2$ (the set of pairs of reals), in which
$(x_0, x_1, \ldots, x_n) \in \mathcal{X}_n$ represents a target system state for a group
with a hypothetical centroid $x_0$ and $x_i$ being the deviation of the i-th target
from $x_0$. The inclusion of such a component is necessary in order to model the
random-set nature of targets. For this purpose, we must require a priori
interchangeability of individual targets as precisely defined in Assumption 2
below. In the rest of this paper, we call a function
$\Pi: \mathcal{X}_n \to \mathcal{X}_n$ an n-target permutation homeomorphism (induced by
permutation $\pi$) if, for every $(X_C, (X_i)_{i=1}^{n}) \in \mathcal{X}_n$, $\Pi$ leaves the
common state $X_C$ unchanged and reorders the individual target states
$(X_i)_{i=1}^{n}$ according to a permutation $\pi$ on $\{1, \ldots, n\}$.



Given Assumption 2, we can choose any one of the permutations and assume that the
order of the targets is given in that way. Under Assumptions 1 and 2, we can
construct  a  wide  range  of  target  models,  including  those  in  which  targets  move 
in  a  group  rather  than  individually,  i.e.,  their  motions  are  not  independent 
but  correlated. 


B.  Sensor  Model 

Let S be a finite set of sensors in the system. In this paper, each
sensor $s \in S$ is modeled as a generic mechanism which observes the target system
state space and generates a finite set of measurements, called a data set,
intermittently according to a certain sampling pattern. Each measurement in
a data set from a sensor $s \in S$ is an element of the measurement value space
$\mathcal{Y}_s$, which is a hybrid space with the direct-product measure $\mu_s$ of the
Lebesgue measure on the continuous part and the counting measure on the discrete
part. The continuous part of $\mathcal{Y}_s$ is used for analog information such as
positions and velocities whereas the discrete part is for feature-type
information such as size/cross-section classification of aircraft radar images,
track/wheel classification of ground vehicle images, etc.


Formally a data set is a random element $((y_j)_{j=1}^{N_M}, N_M, t, s)$ in the data
set space

$$\bigcup_{m=0}^{\infty} \bigcup_{s \in S} (\mathcal{Y}_s)^m \times \{m\} \times [t_0, \infty) \times \{s\},$$

where $(\mathcal{Y}_s)^m = \mathcal{Y}_s \times \cdots \times \mathcal{Y}_s$ when m>0 and
$(\mathcal{Y}_s)^0 = \{\theta\}$ ($\theta$ is a symbol for "no measurement"). A quadruple
$((y_j)_{j=1}^{N_M}, N_M, t, s)$ in this space is interpreted as a data set
generated by sensor s at time t and containing $N_M$ measurements, $y_1, \ldots, y_{N_M}$.


In our generic sensor model, the generation of data sets by sensors is modeled
by a four-step mechanism: (1) detection, (2) number-of-false-alarm generation,
(3) random assignment and (4) measurement value generation. First we assume
a certain sensor scheduling mechanism which determines what sensor is activated
when. Once a sensor $s \in S$ is activated at time t, a data set
$((y_j(t,s))_{j=1}^{N_M(t,s)}, N_M(t,s), t, s)$ is generated instantaneously through
the following mechanisms:


(1)  Detection: 

A detected target set is a unit which generates one measurement in the
sensor's output. Such a set is modeled by a detected target set collection
which is a random collection D(t,s) of nonempty subsets of positive integers
such that

$$\mathrm{Prob.}\{\textstyle\bigcup D(t,s) \subset I_T \mid N_T\} = 1, \tag{6}$$

where

$$I_T = \{1, \ldots, N_T\}. \tag{7}$$

$\{i_1, i_2\} \in D(t,s)$ means the $i_1$-th and the $i_2$-th targets are detected and
create one measurement in the data set. Thus $\bigcup D(t,s)$ is the set of all the
detected targets. The random nonnegative integer defined by

$$N_D(t,s) = \#(D(t,s)) \tag{8}$$

is called the number of detected target sets.


(2) Number of False Alarms

A measurement in a data set is called a false alarm (a generic name for
clutter, false return, etc.) if it does not originate from any target. In our
model, the origins of non-false-alarm measurements are D(t,s) and the number
of false alarm measurements is represented by a random nonnegative integer
$N_{FA}(t,s)$. Thus the random nonnegative integer $N_M(t,s)$ is determined by

$$N_M(t,s) = N_D(t,s) + N_{FA}(t,s). \tag{9}$$


(3)  Random  Assignment 


Each of the $N_M(t,s)$ measurements in the data set originates from one of the
$N_D(t,s)$ detected target sets or is one of the $N_{FA}(t,s)$ false alarms. Each
origin is determined by a random assignment A(t,s) which is a one-to-one
integer-valued random function such that

$$\mathrm{Prob.}\{\mathrm{Dom}(A(t,s)) = D(t,s),\ \mathrm{Image}(A(t,s)) \subset J_M(t,s) \mid N_M(t,s)\} = 1 \tag{10}$$

where

$$J_M(t,s) = \{1, \ldots, N_M(t,s)\}, \tag{11}$$

and Dom(f) and Image(f) are the domain and the image of a function f. Define a
random set $J_{FA}(t,s)$ by

$$J_{FA}(t,s) = J_M(t,s) \setminus \mathrm{Image}(A(t,s)). \tag{12}$$

$d \in D(t,s)$ and $A(t,s)(d) = j$ means that the j-th measurement originates from a
detected target set d, whereas $j \in J_{FA}(t,s)$ means the j-th measurement is a false
alarm.

(4) Measurement Values

Finally, given the number of measurements, $N_M(t,s)$, and the origin of each
j in $J_M(t,s)$, the measurement value vector $(y_j(t,s))_{j=1}^{N_M(t,s)}$ is
generated, completing the data-set-generating mechanism.

For  the  rest  of  this  paper,  the  following  five  assumptions  are  made: 

Assumption  3:  (Known  Exact  Timing) 

Every  sensor  generates  a  finite  number  of  data  sets  within  any  finite  time 
interval.  The  time  at  which  any  data  set  is  generated  is  exactly  known  and  com¬ 
pletely  determined  by  each  individual  sensor  (not  by  any  other  factor  cor¬ 
related  with  the  target  system  state) . 

Assumption  4:  (Memory-less  Sensor) 

There  is  no  memory  in  any  sensor,  so  that  any  single-data-set  statistics 
conditioned  on  the  current  target  system  state  and  any  other  statistical  condi¬ 
tion  are  the  same  as  the  ones  conditioned  only  on  the  current  target  system  state. 

Assumption  5:  (No  Merged  Measurements) 

No measurement in any data set from any sensor originates from two or more
targets. 

Assumption  6:  (No  Split  Measurements) 

No  target  generates  more  than  one  measurement  in  any  data  set. 

Assumption  7:  (Random  Order) 

The  order  of  measurements  in  any  data  set  contains  no  information  about  the 
target system state.
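
The four-step data-set-generating mechanism, specialized by Assumptions 5 to 7, can be sketched as a small simulation. Everything numerical here is an assumption made for illustration and is not taken from the report: a constant detection probability applied independently to each target, a Poisson number of false alarms, scalar target positions observed with Gaussian noise, and false-alarm values uniform on [-10, 10]. The function name `generate_data_set` is likewise hypothetical.

```python
import math
import random


def generate_data_set(targets, t, s, rng, p_detect=0.9, fa_mean=1.0, noise_std=0.5):
    """Return ((y_1, ..., y_m), m, t, s) and the hidden assignment.

    `targets` is a list of scalar target positions (an illustrative choice).
    """
    n_t = len(targets)

    # (1) Detection: with no merged or split measurements, every detected
    #     target set is a singleton {i}, so a list of target indices suffices.
    detected = [i for i in range(1, n_t + 1) if rng.random() < p_detect]

    # (2) Number of false alarms: Poisson (assumed), sampled by multiplying
    #     uniforms until the product drops below exp(-mean).
    n_fa, prod, limit = 0, rng.random(), math.exp(-fa_mean)
    while prod > limit:
        n_fa += 1
        prod *= rng.random()
    n_m = len(detected) + n_fa  # number of measurements, as in (9)

    # (3) Random assignment: a uniformly chosen one-to-one map from the
    #     detected targets into the measurement indices {1, ..., N_M}.
    slots = rng.sample(range(1, n_m + 1), len(detected))
    assignment = dict(zip(detected, slots))

    # (4) Measurement values: noisy position for target-originated
    #     measurements, uniform clutter for false alarms.
    origin = {j: i for i, j in assignment.items()}
    values = []
    for j in range(1, n_m + 1):
        if j in origin:
            values.append(targets[origin[j] - 1] + rng.gauss(0.0, noise_std))
        else:
            values.append(rng.uniform(-10.0, 10.0))
    return (tuple(values), n_m, t, s), assignment


rng = random.Random(1)
data_set, hidden_assignment = generate_data_set([0.0, 3.0, -4.0], 1.0, "radar", rng)
print(data_set)
print(hidden_assignment)
```

The assignment is returned separately because it is exactly the quantity a tracker does not observe: only the quadruple is available, and the origin of each measurement has to be hypothesized.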


Assumptions 3 and 4 are standard in filtering problems and allow us to
use standard techniques used in sampled-data or discrete-time filtering problems.
Assumptions 5 and 6 imply that, for any $(t,s) \in [t_0, \infty) \times S$, D(t,s) is a
disjoint collection of singletons so that we can define a random set $I_D(t,s)$ by

$$I_D(t,s) = \{ i \mid \{i\} \in D(t,s) \} \tag{13}$$

such that $\mathrm{Prob.}\{I_D(t,s) \subset I_T \mid N_T\} = 1$ and a binary random function
$F_D(t,s)$ by

$$F_D(t,s) = \chi(\,\cdot\,; I_D(t,s)) \tag{14}$$

where $\chi(\,\cdot\,; A)$ is the indicator function of a set A. The random function
$F_D(t,s)$ is called the detection function. Then, due to Assumptions 3 and 4, the
detection mechanism is completely modeled by specifying the detection
probability function

$$P_D(\delta \mid X, n, t, s) = \mathrm{Prob.}\{F_D(t,s) = \delta \mid X(t) = X,\ N_T = n\} \tag{15}$$

for every $(\delta, X, n, t, s) \in \bigcup_{n=0}^{\infty} \mathcal{D}(n) \times \mathcal{X}_n \times \{n\} \times [t_0, \infty) \times S$
such that $\sum_{\delta \in \mathcal{D}(n)} P_D(\delta \mid X, n, t, s) = 1$
for any positive n, where $\mathcal{D}(n)$ is the set of all the binary functions defined
on $\{1, \ldots, n\}$. The same couple of assumptions allows us to describe the
number-of-false-alarms generation by specifying the number-of-false-alarms
probability function

$$P_{N_{FA}}(m \mid \delta, X, n, t, s) = \mathrm{Prob.}\{N_{FA}(t,s) = m \mid F_D(t,s) = \delta,\ X(t) = X,\ N_T = n\}. \tag{16}$$


Let $\mathcal{A}(I, J)$ be the set of all the one-to-one functions defined on I taking
values in J. Then we have

$$\mathrm{Prob.}\{A(t,s) \in \mathcal{A}(D(t,s), J_M(t,s)) \mid D(t,s), J_M(t,s)\} = 1 \tag{17}$$

and Assumption 7 implies

$$\mathrm{Prob.}\{A(t,s) = a \mid D(t,s), J_M(t,s)\} = \mathrm{Prob.}\{A(t,s) = a' \mid D(t,s), J_M(t,s)\} \tag{18}$$

for all pairs (a, a') of elements in $\mathcal{A}(D(t,s), J_M(t,s))$. Hence we have

$$\mathrm{Prob.}\{A(t,s) = a \mid N_M(t,s), D(t,s), X(t), N_T\} = \frac{(N_M(t,s) - N_D(t,s))!}{(N_M(t,s))!} \tag{19}$$

for each a in $\mathcal{A}(D(t,s), J_M(t,s))$. Finally the sensor model is completed by
specifying the measurement value probability density function $P_M$ defined by

$$P_M(y \mid a, m, \delta, X, n, t, s)\, \mu_s^m(dy) = \mathrm{Prob.}\{y \in dy \mid A(t,s) = a,\ N_M(t,s) = m,\ F_D(t,s) = \delta,\ X(t) = X,\ N_T = n\} \tag{20}$$

for every $(y, a, m, \delta, X, n, t, s)$ where $\mu_s^m$ is the m-tuple direct-product
measure of $\mu_s$.
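
The value in (19) is a counting result: since Assumption 7 makes every one-to-one assignment equally likely, the probability of any particular assignment is one over the number of one-to-one maps from the $N_D$ detected target sets into the $N_M$ measurement indices, that is, $(N_M - N_D)!/N_M!$. The following brute-force check is only a sketch; the function names are hypothetical and detected target sets are represented by plain integer labels.

```python
from itertools import permutations
from math import factorial


def one_to_one_assignments(detected, n_m):
    """All one-to-one maps from the detected target sets into {1, ..., n_m}."""
    slots = range(1, n_m + 1)
    return [dict(zip(detected, image)) for image in permutations(slots, len(detected))]


def assignment_probability(n_d, n_m):
    """Probability of one particular assignment: (N_M - N_D)! / N_M!."""
    return factorial(n_m - n_d) / factorial(n_m)


maps = one_to_one_assignments(detected=[1, 3], n_m=4)
# With equally likely assignments, each one has probability 1 / len(maps).
print(len(maps), 1.0 / len(maps), assignment_probability(2, 4))
```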


It  is  clear  that  our  general  sensor  model  as  well  as  our  general  target  model 
allows us to consider a variety of modern sensor systems. One should particularly
note that the probability of detection is generally dependent on the target
system  state.  Therefore,  the  absence  of  returns  is  at  least  potentially  as  infor¬ 
mative  as  their  presence.  For  example,  for  a  sensor  monitoring  the  radio  communi¬ 
cation  of  target(s),  the  probability  of  detection  is  zero  when  the  equipment  is 
shut  off,  and  the  on/off  of  such  equipment  should  be  included  in  the  target 


system state. In particular, $N_M = 0$, or no measurement, is a data set in our model
and considered a potential piece of information. It should also be noted that
in our general sensor model every data set contains "number-of-measurements"
information, and hence, every sensor is a type 1 sensor in Reid's terminology
in [4]. A type 2 sensor in his terminology is a sensor which creates data sets
with $N_M$ (number of measurements) = 0 or 1 with probability one in our model and is
not  (at  least  in  principle)  treated  separately.  Of  course,  Assumption  3  has  a 
crucial  role  in  such  a  treatment  as  ours. 

For  sensor  systems  which  involve  measurement  time  delays  dependent  on 
the  target  state  (e.g.,  acoustic  sensor  systems  described  in  [7]),  a  straight¬ 
forward  model  in  which  the  target  state  is  a  pair  (position, velocity)  and  the  sensor 
measurements are ranges and/or bearings violates Assumptions 3 and 4. In such
a  case,  in  order  for  our  formulation  to  be  applicable,  careful  modeling  is 
called  for  so  that  our  assumptions  are  valid  at  least  in  an  approximate  sense. 

On the other hand, although Assumptions 5 and 6 are quite standard in the multitarget
literature, they are not essential to the development in this paper.
Recently, an attempt was made to relax Assumption 5 to deal with merged measurements
in [8]. We make these assumptions in this paper largely to minimize nonessential
complexity. Another way to state the last assumption, Assumption 7,
is  that  a  data  set  is  the  smallest  unit  of  sensor  data  in  which  the  order  of 

measurements  does  not  contain  any  information  about  the  target  system  state.  For 
example,  the  measurements  from  a  radar  with  a  fixed  scanning  pattern  may  result 
in  the  order  of  measurements  containing  information  about  the  targets.  In  such 

a  case,  the  data  sets  should  be  further  divided  so  that  the  measurement  order 
does  not  contain  any  significant  information. 

Our general sensor model is completely described by specifying the
detection probability function, $P_D$, the number-of-false-alarms probability function,
$P_{N_{FA}}$, and the measurement value probability density function $P_M$. Finally,
since our sensor model is a mechanism in which a random set generates other
random sets, we need one more assumption on the target interchangeability
(permutability) corresponding to Assumption 2:

Assumption  8:  (Interchangeability  (2)) 

$P_D$, $P_{N_{FA}}$ and $P_M$ are invariant under the permutation of targets, i.e.,
$P_D(\delta \circ \pi \mid \Pi(X), n, t, s)$, $P_{N_{FA}}(m \mid \delta \circ \pi, \Pi(X), n, t, s)$ and
$P_M(y \mid a \circ \bar{\pi}, m, \delta \circ \pi, \Pi(X), n, t, s)$ are
all invariant with respect to any n-target permutation homeomorphism $\Pi$ induced
by any permutation $\pi$, where $\bar{\pi}(\{i\}) = \{\pi(i)\}$ and $\circ$ is the function
composition operation ($f \circ g(x) = f(g(x))$ for all x in Dom(f)).

Tracks and hypotheses are among the most frequently used terms in the
multitarget tracking literature. Often, however, these are not precisely
defined. Our definitions of tracks and hypotheses, given below, closely follow
Morefield's notation in [9] but differ in one crucial aspect, namely the
separation of the measurement-value information and the number-of-measurements
information in each data set.

Let Z be the collection of all the data sets. Due to Assumption 3, Z
is countable. Without loss of generality, we can assume that Z has a
one-to-one correspondence to a subset K of $[t_0, \infty) \times S$ through the
isomorphism

$$(y, m, t, s) \in Z \ \longmapsto\ (t, s) \in K.$$

Hence, for every k in K, we can denote the unique member (y,m,k) in Z by
Z(k). We may also call Z(k) data set k. It is then natural to call K the
data set index set. Let $\preceq$ be any total order on K such that
$(t,s) \preceq (t',s')$ whenever t < t'. ($k \prec k'$ if $k \preceq k'$ but $k \neq k'$.)
Such an order may not be unique but its existence is obvious.

For each k in K, define the cumulative data set $Z^{(k)}$ up to k by

$$Z^{(k)} = \bigcup_{k' \preceq k} Z(k') \tag{21}$$

and the cumulative measurement index set $J_M^{(k)}$ up to k by

$$J_M^{(k)} = \bigcup_{k' \preceq k} J_M(k') \times \{k'\},$$

where $J_M$ is defined by (11).

Due to Assumption 3, we may treat K as a completely deterministic set
whereas $Z^{(k)}$ and $J_M^{(k)}$ are random. Every (j,t,s) in $J_M^{(k)}$ indicates the
j-th measurement in a data set from sensor s at time t. Then, for each k in K,
a track at k is a subset of $J_M^{(k)}$ and a data-to-data association hypothesis
(henceforth referred to simply as hypothesis) at k is a (possibly empty)
collection of nonempty track(s). A track $\tau$ at k is said to be possible if

$$\#\bigl( (J_M(k') \times \{k'\}) \cap \tau \bigr) \le 1$$

for all $k' \preceq k$ (Assumption 6). Let the set of all the possible tracks at k be
denoted by $\mathcal{T}(k)$. A hypothesis $\lambda$ at k is said to be possible if it is a
subset of $\mathcal{T}(k) \setminus \{\emptyset\}$ and $\tau \cap \tau' = \emptyset$ for all
the pairs $(\tau, \tau')$ of tracks in $\lambda$ such that $\tau \neq \tau'$
(Assumption 5). Denote the set of all the possible hypotheses at k by $\mathcal{H}(k)$.
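
The definitions of possible tracks and possible hypotheses translate directly into set operations. The sketch below assumes, purely for illustration, that data sets are indexed by integers k = 1, 2, ... (rather than by pairs (t,s)) and that a measurement index is a pair (j, k); the function names are hypothetical. The enumeration is exhaustive and only meant for very small examples.

```python
from itertools import combinations


def is_possible_track(track):
    """A track is a set of (j, k) pairs with at most one measurement per data set k."""
    data_sets = [k for (_, k) in track]
    return len(data_sets) == len(set(data_sets))


def is_possible_hypothesis(hypothesis):
    """A hypothesis is a collection of nonempty, possible, pairwise disjoint tracks."""
    tracks = list(hypothesis)
    if any(len(tr) == 0 or not is_possible_track(tr) for tr in tracks):
        return False
    return all(a.isdisjoint(b) for a, b in combinations(tracks, 2))


def possible_tracks(measurement_counts):
    """Enumerate every nonempty possible track; measurement_counts[k] is N_M(k)."""
    tracks = [frozenset()]
    for k, n_m in sorted(measurement_counts.items()):
        # Each existing track either skips data set k or takes one measurement from it.
        tracks += [tr | {(j, k)} for tr in tracks for j in range(1, n_m + 1)]
    return [tr for tr in tracks if tr]


def possible_hypotheses(measurement_counts):
    """Enumerate every possible hypothesis, including the empty one."""
    tracks = possible_tracks(measurement_counts)
    result = []
    for r in range(len(tracks) + 1):
        for combo in combinations(tracks, r):
            if is_possible_hypothesis(combo):
                result.append(frozenset(combo))
    return result


# Two data sets with one measurement each.
counts = {1: 1, 2: 1}
print(len(possible_tracks(counts)), len(possible_hypotheses(counts)))
```

For two data sets with one measurement each, the possible hypotheses are: both measurements are false alarms; only the first originates from a target; only the second does; both originate from the same target; they originate from two different targets.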

Define a random set, via the random function A(k) and the random set $I_D(k)$,

$$\Lambda = \Bigl\{ \bigl\{ (A(k)(\{i\}), k) \bigm| k \in K,\ i \in I_D(k) \bigr\} \Bigm| i \in \bigcup_{k \in K} I_D(k) \Bigr\}.$$

Its restriction to k is defined by

$$\Lambda|_k = \{ \tau \cap J_M^{(k)} \mid \tau \in \Lambda \} \setminus \{\emptyset\} \tag{25}$$

for each $k \in K$. Then it is clear that, for a $\lambda \in \mathcal{H}(k)$, event
$\{\Lambda|_k = \lambda\}$ should be interpreted as an incidence in which (1) there are
$\#(\bigcup_{k' \preceq k} I_D(k')) = \#(\lambda)$ targets which are detected and included
in at least one of the data sets prior to and including k, (2) each $\tau$ in
$\lambda$ corresponds to a target (which has been detected in at least one data set
$k' \preceq k$) in a one-to-one fashion, (3) $(j, k') \in \tau$ means that the j-th
measurement in data set k' originates from the target identified by $\tau$,
(4) $\tau \cap (J_M(k') \times \{k'\}) = \emptyset$ means that the target is "falsely"
dismissed at k', and (5) the elements of $J_M^{(k)}$ not contained in any track of
$\lambda$ form the set of all the false alarms up to k.

Therefore, every $\lambda$ in $\mathcal{H}(k)$ is a hypothesized set of tracks which are in
turn the sets of measurement indices which are hypothesized to originate from
targets. The term "hypothesis" is thus suitable for use in our formulation.
Assumptions 5 and 6 imply

$$\mathrm{Prob.}\{\Lambda|_k \in \mathcal{H}(k) \mid J_M^{(k)}\} = \sum_{\lambda \in \mathcal{H}(k)} \mathrm{Prob.}\{\Lambda|_k = \lambda \mid J_M^{(k)}\} = 1. \tag{26}$$

In other words, $\mathcal{H}(k)$ is the mutually exclusive and collectively exhaustive
set of all the possible "explanations" of the origin of each measurement in the
data sets up to k.


At this point a few words of caution are in order, because a straightforward
expansion such as

$$\mathrm{Prob.}\{X(t) \in dX, N_T = n \mid Z^{(k)}\} = \sum_{\lambda \in \mathcal{H}(k)} \sum_{n=0}^{\infty} \mathrm{Prob.}\{X(t) \in dX \mid N_T = n, \Lambda|_k = \lambda, Z^{(k)}\} \cdot \mathrm{Prob.}\{N_T = n \mid \Lambda|_k = \lambda, Z^{(k)}\}\, \mathrm{Prob.}\{\Lambda|_k = \lambda \mid Z^{(k)}\}$$

is in general meaningless and $\mathrm{Prob.}\{X(t) \in dX, N_T \mid \Lambda|_k = \lambda, Z^{(k)}\}$
may not be a part of an appropriate set of variables which may constitute a state
of a multitarget tracker (information state), as we will see in the next section.
Nonetheless, our primary objective is to evaluate every hypothesis $\lambda$, or
calculate $\mathrm{Prob.}\{\Lambda|_k = \lambda \mid Z^{(k)}\}$.
Before closing this section, let us introduce some notation which will be useful
later.


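Similar to (24), for each $k$ in $K$ and for each $k' \le k$,

$$\tau_{|k'} = \tau \cap J_M^{(k')}$$

in $\mathcal{T}(k')$ is called the restriction of $\tau$ ($\in \mathcal{T}(k)$) to $J_M^{(k')}$ (or simply to $k'$), and

$$\lambda_{|k'} = \{\tau_{|k'} \mid \tau \in \lambda\} \setminus \{\emptyset\}$$

in $\mathcal{H}(k')$ is called the restriction of $\lambda$ ($\in \mathcal{H}(k)$) to $J_M^{(k')}$ (or simply to $k'$). When
$\tau'$ ($\lambda'$, resp.) is the restriction (to some $J_M^{(k')}$) of a track $\tau \in \mathcal{T}(k)$ (a hypothesis
$\lambda \in \mathcal{H}(k)$, resp.), $\tau'$ ($\lambda'$, resp.) is called a predecessor of $\tau$ ($\lambda$, resp.).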

Successors are defined by the inverse relation. Then, it is obvious that, when
a cumulative measurement set $J_M^{(k)}$ is given, the set of all the hypotheses up
to $k$, $\bigcup_{k' \le k} \mathcal{H}(k')$, and the set of all the tracks up to $k$, $\bigcup_{k' \le k} \mathcal{T}(k')$,
are both (partially) ordered with respect to the order determined by the
predecessor/successor  relation.  Both  of  these  ordered  sets  are 
arborescent,  i.e.,  the  set  of  predecessors  of  any  element  is  totally  ordered. 
For  this  reason,  approaches  similar  to  the  one  described  in  this  paper  are 
often  referred  to  as  hypothesis-tree  or  track-splitting  methods. 
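The restriction and predecessor relations can be illustrated with a small sketch. The representation below is an assumption made only for illustration: a track is a frozenset of (j, k) index pairs, a hypothesis is a frozenset of tracks, and data-set indices are plain integers ordered by <=. The function names are hypothetical and do not appear in the report.

```python
# Sketch only: tracks are frozensets of (j, k) pairs, hypotheses are
# frozensets of tracks, and data-set indices k are integers (an assumption).

def restrict_track(track, k_prime):
    """Restriction of a track to the data sets up to and including k_prime."""
    return frozenset((j, k) for (j, k) in track if k <= k_prime)

def restrict_hypothesis(hypothesis, k_prime):
    """Restrict every track, then discard the empty track."""
    restricted = {restrict_track(t, k_prime) for t in hypothesis}
    restricted.discard(frozenset())
    return frozenset(restricted)

tau_a = frozenset({(1, 1), (2, 2)})   # measurement 1 of set 1, measurement 2 of set 2
tau_b = frozenset({(1, 2)})           # first detected in data set 2
lam = frozenset({tau_a, tau_b})
parent = restrict_hypothesis(lam, 1)  # predecessor of lam at k' = 1
```

Because each hypothesis has exactly one restriction at every earlier data set, repeatedly applying `restrict_hypothesis` walks a single chain of predecessors, which is the arborescent (tree) structure described above.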


IV.  Recursive  Bayesian  Evaluation  of  Hypotheses 


In this section, a recursive Bayesian evaluation of every hypothesis $\lambda$,
namely, a recursive calculation of $\mathrm{Prob.}\{\Lambda_{|k} = \lambda \mid Z^{(k)}\}$, is described. The
main result is presented as Theorem 1, the proof of which is given in
Appendix A. The calculation is made recursively with respect to the total
order $\le$ on $K$.


In  this  section,  the  symbol  P  will  occasionally  be  used  with  a  slight 
notational  abuse.  It  will  represent  a  conditional  probability,  a  conditional 
probability  density  function  or  a  mixture  of  both.  Using  P  in  this  way,  we 
can  write  our  basic  recursive  equation  as 


$$P(\Lambda_{|k} \mid Z^{(k)}) = \frac{P(Z(k), \Lambda_{|k} \mid Z^{(k')}, \Lambda_{|k'})}{P(Z(k) \mid Z^{(k')})}\, P(\Lambda_{|k'} \mid Z^{(k')}) \tag{29}$$

for each $k$ in $K$ which has an immediate predecessor $k'$. If we assume that
$P(\Lambda_{|k'} \mid Z^{(k')})$ has already been calculated, since the denominator of the right
hand side of (29) is the normalizing constant, the left hand side of (29) is
given completely by calculating $P(Z(k), \Lambda_{|k} \mid Z^{(k')}, \Lambda_{|k'})$. Roughly speaking, this
term can be expanded as


$$P(Z(k), \Lambda_{|k} \mid Z^{(k')}, \Lambda_{|k'}) = \sum_{N_T} P(N_T \mid \Lambda_{|k'}, Z^{(k')}) \int P(Z(k), \Lambda_{|k} \mid X(t), N_T, \Lambda_{|k'}, Z^{(k')})\, P(dX(t) \mid N_T, \Lambda_{|k'}, Z^{(k')}) \tag{30}$$

with $k=(t,s)$. Therefore, assuming that $P(N_T \mid \Lambda_{|k'}, Z^{(k')})$ and $P(dX(t) \mid N_T, \Lambda_{|k'}, Z^{(k')})$
are provided by recursion, (30) can be calculated if we know
$P(Z(k), \Lambda_{|k} \mid X(t), N_T, \Lambda_{|k'}, Z^{(k')})$, which can, in fact, be calculated using the
generic sensor model described in Section II.


Before  proceeding  with  further  discussion,  we  make  a  few  preparatory 
observations.  For  each  k  in  K,  define  a  random  set, 


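$$I_D^{(k)} = \bigcup_{k' \le k} I_D(k'),$$

of the cumulative index set of detected targets. Then the definition (24) of $\Lambda$
implies that $\#(\Lambda_{|k}) = \#(I_D^{(k)})$ with probability 1. Even when we hypothesize
$N_T = n$ and $\Lambda_{|k} = \lambda$ for some $\lambda \in \mathcal{H}(k)$ and some $n \ge \#(\lambda)$, the true origin in $I_D^{(k)}$ of each
track $\tau$ in $\lambda$ is still uncertain. This uncertainty can be modeled by a random
integer-valued function $\Omega_k$ such that $\mathrm{Prob.}\{\mathrm{Dom}(\Omega_k) = \Lambda_{|k}, \mathrm{Image}(\Omega_k) = I_D^{(k)} \mid \Lambda_{|k}, I_D^{(k)}\} = 1$
and defined by

$$\Omega_k(\tau) = i \ \text{ if and only if } \ \tau = \{(A(k')(\{i\}), k') \mid k' \le k\} . \tag{32}$$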


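Then Assumptions 2 and 8 imply with a simple recursive argument that

$$\mathrm{Prob.}\{\Omega_k = \omega \mid \Lambda_{|k} = \lambda, N_T = n\} = (\#(W(\lambda,n)))^{-1} = \frac{(n - \#(\lambda))!}{n!} \tag{33}$$

for each $\lambda \in \mathcal{H}(k)$, each $n \ge \#(\lambda)$ and each $\omega \in W(\lambda,n)$, where $W(\lambda,n)$ is the set
of all the one-to-one functions defined on $\lambda$ taking values in $\{1,\ldots,n\}$.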
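The count behind (33) can be checked by enumeration. The sketch below is only illustrative: tracks are represented by arbitrary labels, and the function names are hypothetical.

```python
from itertools import permutations
from math import factorial

def one_to_one_maps(tracks, n):
    """All injective assignments of tracks to target indices 1..n (the set W)."""
    tracks = list(tracks)
    return [dict(zip(tracks, image))
            for image in permutations(range(1, n + 1), len(tracks))]

def origin_probability(num_tracks, n):
    """Uniform probability of one assignment: (n - #lambda)! / n!."""
    return factorial(n - num_tracks) / factorial(n)

maps = one_to_one_maps(["tau_a", "tau_b"], 4)   # two tracks, four targets
prob = origin_probability(2, 4)
```

With two tracks and four targets there are 4!/2! injective maps, so each one receives probability 2!/4!, and the probabilities sum to one.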


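Moreover, the same set of assumptions implies that, for any $k=(t,s)$ in $K$,
any $\lambda$ in $\mathcal{H}(k)$, any $n \ge \#(\lambda)$, any $\omega \in W(\lambda,n)$, any permutation $\pi$ on $\{1,\ldots,n\}$,
and any $\tilde{\omega} \in W(\lambda,n)$, we have

$$\mathrm{Prob.}\{X(t) \in dx \mid \Omega_k = \omega, \Lambda_{|k} = \lambda, N_T = n\} = \mathrm{Prob.}\{X(t) \in \Pi(dx) \mid \Omega_k = \tilde{\omega}, \Lambda_{|k} = \lambda, N_T = n\} \tag{34}$$

if $\tilde{\omega}(\tau) = \pi(\omega(\tau))$ for all $\tau$ in $\lambda$ and $\Pi: \mathcal{X}_n \to \mathcal{X}_n$ is the $n$-target permutation
homeomorphism induced by the permutation $\pi$.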


Since our sensor model described in Section II is based on a
"fixed" order of targets, we must further hypothesize the correspondence between
a hypothesis $\lambda$ and its origin in $I_D^{(k)}$ in order to calculate (30). For this reason,

$$\mathrm{Prob.}\{X(t) \in dx \mid N_T = n, \Lambda_{|k} = \lambda, Z^{(k)}\}$$
$$= \sum_{\omega \in W(\lambda,n)} \mathrm{Prob.}\{X(t) \in dx \mid \Omega_k = \omega, N_T = n, \Lambda_{|k} = \lambda, Z^{(k)}\}\, \mathrm{Prob.}\{\Omega_k = \omega \mid N_T = n, \Lambda_{|k} = \lambda, Z^{(k)}\}$$
$$= \frac{(n - \#(\lambda))!}{n!} \sum_{\omega \in W(\lambda,n)} \mathrm{Prob.}\{X(t) \in dx \mid \Omega_k = \omega, N_T = n, \Lambda_{|k} = \lambda, Z^{(k)}\} \tag{35}$$

cannot be a part of the information state to be propagated to complete the
recursion. Also, (35) is, in general, not a good candidate for the tracker
output either. For example, suppose that $n=2$ and $\lambda=\{\tau\}$. Then the quantity,


$$\mathrm{Prob.}\{X(t) \in dx \mid N_T = n, \Lambda_{|k} = \lambda, Z^{(k)}\} = \tfrac{1}{2}\, \mathrm{Prob.}\{X_1(t) \in dx_1, X_2(t) \in dx_2 \mid \omega_1, n, \lambda, Z^{(k)}\} + \tfrac{1}{2}\, \mathrm{Prob.}\{X_1(t) \in dx_1, X_2(t) \in dx_2 \mid \omega_2, n, \lambda, Z^{(k)}\}$$

with $\omega_1(\tau)=1$ and $\omega_2(\tau)=2$ does not make much sense. It is actually an
over-aggregation of information.


As  seen  in  the  subsequent  theorem  and  Appendix  A,  however,  the  following 
three  functions  can  constitute  an  information  state  of  the  tracker:  For 
each  k  in  K,  define 


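$$P_\Lambda^{(k)}(\lambda \mid Z^{(k)}) = \mathrm{Prob.}\{\Lambda_{|k} = \lambda \mid Z^{(k)}\} ,$$

$$P_{N_T}^{(k)}(n \mid \lambda, Z^{(k)}) = \mathrm{Prob.}\{N_T = n \mid \Lambda_{|k} = \lambda, Z^{(k)}\} , \ \text{and} \tag{37}$$

$$P_X^{(k)}(dx \mid \omega, n, \lambda, Z^{(k)}) = \mathrm{Prob.}\{X(t) \in dx \mid \Omega_k = \omega, N_T = n, \Lambda_{|k} = \lambda, Z^{(k)}\} , \tag{38}$$

for each $\lambda \in \mathcal{H}(k)$, each $n \ge \#(\lambda)$ (by the definition (24) of $\Lambda$, obviously
$P_{N_T}^{(k)}(n \mid \lambda, Z^{(k)}) = 0$ if $n < \#(\lambda)$) and for some $\omega \in W(\lambda,n)$. Because of (34), just one
$\omega$ is enough for (38). Again, with a somewhat notationally abusive usage of
$P$, we have the following Bayesian expansion: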


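$$P(Z(k), \Lambda_{|k} \mid Z^{(k')}, \Lambda_{|k'}) = \sum_{\Omega_k} \sum_{\Omega_{k'}} \sum_{N_T} P(Z(k), \Lambda_{|k}, \Omega_k \mid \Omega_{k'}, N_T, \Lambda_{|k'}, Z^{(k')})\, P(\Omega_{k'} \mid N_T, \Lambda_{|k'}, Z^{(k')})\, P_{N_T}^{(k')}(N_T \mid \Lambda_{|k'}, Z^{(k')}) \tag{39}$$

In the first term of the right hand side of (39), there is no longer ambiguity
in the origins of measurements. The rest of the terms are given by (33) and (37).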


The  final  form  of  our  main  result  is  stated  by  the  following  theorem: 
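Theorem 1: Under Assumptions 1 to 8, for any $k=(t,s)$ in $K$ with an immediate
predecessor $k'=(t',s')$, when $Z(k)=(y,m,k)$ is given, we have

$$P_\Lambda^{(k)}(\lambda \mid Z^{(k)}) = \frac{P_\Lambda^{(k')}(\lambda_{|k'} \mid Z^{(k')})\, (m - n_D(\lambda|k))!}{P_Z^{(k)}(Z(k) \mid Z^{(k')})\, m!} \sum_{n=\#(\lambda)}^{\infty} \frac{(n - \#(\lambda_{|k'}))!}{(n - \#(\lambda))!}\, P_{N_T}^{(k')}(n \mid \lambda_{|k'}, Z^{(k')}) \cdot \mathcal{L}(y,m,n,\lambda,k) \tag{40}$$

for each $\lambda$ in $\mathcal{H}(k)$, where $P_Z^{(k)}(Z(k) \mid Z^{(k')})$ is the normalizing constant,
$n_D(\lambda|k)$ is the number of detected targets at $k$ which $\lambda$ hypothesizes, i.e.,

$$n_D(\lambda|k) = \#(\{\tau \in \lambda \mid (j,k) \in \tau \text{ for some } j\}) , \tag{41}$$

$\mathcal{L}(y,m,n,\lambda,k)$ is the likelihood of $(y,m)$ given $(n,\lambda)$ at $k$ and is defined by

$$\mathcal{L}(y,m,n,\lambda,k) = \int P_M(y \mid \alpha,m,\delta,x,n,k)\, P_{N_{FA}}(m - n_D(\lambda|k) \mid \delta,x,n,k)\, P_D(\delta \mid x,n,k)\, F_{\Delta t}(dx \mid x')\, P_X^{(k')}(dx' \mid \omega', n, \lambda_{|k'}, Z^{(k')}) \tag{42}$$

with $\Delta t = t - t'$, for some $\omega' \in W(\lambda_{|k'}, n)$, $\delta \in \mathcal{D}(n)$ and
$\alpha \in \mathcal{A}(\{i \mid \delta(i)=1\}, \{1,\ldots,m\})$ which are determined by

$$\delta(i) = \begin{cases} 0 & \text{if } i \notin \mathrm{Image}(\omega) \\ \#(\tau \cap (\{1,\ldots,m\} \times \{k\})) & \text{if } i = \omega(\tau) \text{ and } \tau \in \lambda \end{cases} \tag{43}$$

and

$$\alpha(\omega(\tau)) = j \ \text{ if and only if } \ \tau \cap (\{1,\ldots,m\} \times \{k\}) = \{(j,k)\} \ \text{ for all } \tau \in \lambda , \tag{44}$$

using an arbitrary $\omega \in W(\lambda,n)$ such that $\omega(\tau) = \omega'(\tau_{|k'})$ for every $\tau \in \lambda$ such that
$\tau_{|k'} \ne \emptyset$.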


Proof :  See  Appendix  A. 


The above theorem does not state how to start the recursion, i.e., a
formula for the minimum $k$ in $K$. For such a $k$, the left hand side of (40) is
obtained by replacing $P_\Lambda^{(k')}(\lambda_{|k'} \mid Z^{(k')})$ and $P_{N_T}^{(k')}(n \mid \lambda_{|k'}, Z^{(k')})$ on the right
hand side of (40) by 1 and $\mathrm{Prob.}\{N_T = n\}$, resp., and replacing $P_X^{(k')}(dx' \mid \omega', n, \lambda_{|k'}, Z^{(k')})$
in (42) by $Q_0(dx')$ with $\Delta t = t - t_0$.


In order to complete the recursion, we need the updating equations for
$P_X^{(k)}$ and $P_{N_T}^{(k)}$. These equations are obtained by Bayesian expansions which are
very similar to the one used to obtain (40) and are stated
below without proof. Under Assumptions 1-8, for each $k$ in $K$ with
an immediate predecessor $k'$ and for each $\lambda \in \mathcal{H}(k)$, we have

$$P_X^{(k)}(dx \mid \omega, n, \lambda, Z^{(k)}) = (\mathcal{L}(y,m,n,\lambda,k))^{-1}\, P_M(y \mid \alpha,m,\delta,x,n,k)\, P_{N_{FA}}(m - n_D(\lambda|k) \mid \delta,x,n,k)\, P_D(\delta \mid x,n,k) \int F_{\Delta t}(dx \mid x')\, P_X^{(k')}(dx' \mid \omega', n, \lambda_{|k'}, Z^{(k')}) \tag{45}$$

and

$$P_{N_T}^{(k)}(n \mid \lambda, Z^{(k)}) = \begin{cases} (C_{N_T}^{(k)}(\lambda, Z^{(k)}))^{-1}\, \dfrac{(n - \#(\lambda_{|k'}))!}{(n - \#(\lambda))!}\, \mathcal{L}(y,m,n,\lambda,k)\, P_{N_T}^{(k')}(n \mid \lambda_{|k'}, Z^{(k')}) & \text{if } n \ge \#(\lambda) \\ 0 & \text{otherwise,} \end{cases} \tag{46}$$

where $(\omega,\omega')$ and $(\alpha,\delta)$ are chosen or determined in exactly the same way as in
Theorem 1. The normalizing constant in (46) is given by

$$C_{N_T}^{(k)}(\lambda, Z^{(k)}) = \sum_{n=\#(\lambda)}^{\infty} \frac{(n - \#(\lambda_{|k'}))!}{(n - \#(\lambda))!}\, \mathcal{L}(y,m,n,\lambda,k)\, P_{N_T}^{(k')}(n \mid \lambda_{|k'}, Z^{(k')}) .$$

Consequently, we have

$$P(Z(k), \Lambda_{|k} \mid Z^{(k')}, \Lambda_{|k'}) = \frac{(m - n_D(\Lambda_{|k}|k))!}{m!}\, C_{N_T}^{(k)}(\Lambda_{|k}, Z^{(k)}) .$$


When $k$ is minimal, $P_X^{(k')}$ and $P_{N_T}^{(k')}$ in (45) and (46) should be replaced
by $Q_0$ and $\mathrm{Prob.}\{N_T = n\}$ with $\Delta t = t - t_0$. Thus a fairly general multitarget tracking
algorithm with the information state,

$$\Big( \big( P_X^{(k)}(\cdot \mid \omega, n, \lambda, Z^{(k)}),\ P_{N_T}^{(k)}(n \mid \lambda, Z^{(k)}) \big)_{n=\#(\lambda)}^{\infty},\ P_\Lambda^{(k)}(\lambda \mid Z^{(k)}) \Big)_{\lambda \in \mathcal{H}(k)} ,$$

has been completely described. With this algorithm we may, at least
theoretically, handle complicated situations such as targets moving in a group.

It is obvious, however, that implementation poses a serious problem. One of
the difficulties is due to the high dimensionality of the information state
of the multitarget tracker, which essentially covers all the $\mathcal{X}_n$'s, possibly
from $n=0$ to infinity. The likelihood $\mathcal{L}$ defined by (42) must also be calculated
for a great number of combinations of variables. Thus, even when we use
extensive hypothesis reduction (management) techniques (which will be
discussed in Part 2), further research may be necessary for implementing the
general algorithm developed in this section.


On the other hand, if we introduce an appropriate set of independence
assumptions, $P_X^{(k)}$ can be decomposed into a product of factors which can be
shared among different hypotheses. More importantly, a finite set of
distribution functions may cover all the $\mathcal{X}_n$'s. The likelihood $\mathcal{L}$ is also
decomposed in a similar way. Roughly speaking, in such a case every
evaluation can be done at the track level rather than at the hypothesis level.
This will be clarified in the next section. Actually, almost all the existing
multitarget literature assumes such a case. As discussed in the subsequent
section, existing multitarget tracking algorithms can thus be viewed as being
included in the general formula shown in this section as a subset.


By i.i.d. (independent, identically distributed) target models, we actually
mean a class of models for which several independence conditions are assumed.
With such assumptions the general algorithm shown in the previous section can
be greatly simplified, since many terms reduce to products of factors that
can be shared among different hypotheses.


We  now  assume  that  the  target/sensor  model  satisfies  the  following  additional 
set  of  assumptions: 

Assumption A1:

For each positive integer $n$, we have $\mathcal{X}_n^c = \{0\}$ ($\mathcal{X}_n^c$ is ignored henceforth), and
the remaining component of the state space is the $n$-fold product of $\mathcal{X}$,
where $\mathcal{X}$ is a common hybrid space with a hybrid measure (Lebesgue measure $\times$
counting measure), $\mu$. By ignoring $\mathcal{X}_n^c$, the target system state space becomes

$$\mathcal{X}_n = (\mathcal{X})^n = \mathcal{X} \times \cdots \times \mathcal{X} .$$

Given $N_T = n$, $X = (x_i)_{i=1}^n$ is a system of time-homogeneous, independent and
identically distributed Markov processes on $\mathcal{X}$ with the common a priori statistics
defined by the initial distribution density,

$$\mathrm{Prob.}\{x_i(t_0) \in dx\} = q_0(x)\, \mu(dx)$$

and the state transition probability density,

$$\mathrm{Prob.}\{x_i(t+\Delta t) \in dx \mid x_i(t) = x'\} = f_{\Delta t}(x \mid x')\, \mu(dx) .$$

In other words, we have

$$Q_0\Big(\prod_{i=1}^n dx_i\Big) = \prod_{i=1}^n q_0(x_i)\, \mu(dx_i) \tag{50'}$$

$$F_{\Delta t}\Big(\prod_{i=1}^n dx_i \,\Big|\, (x_i')_{i=1}^n\Big) = \prod_{i=1}^n f_{\Delta t}(x_i \mid x_i')\, \mu(dx_i) \tag{51'}$$

Assumption A2:

The a priori distribution of $N_T$, the total number of targets, is Poisson
with mean $\nu_0$, i.e.,

$$\mathrm{Prob.}\{N_T = n\} = \exp(-\nu_0)\, \frac{\nu_0^n}{n!} .$$


Assumption A3:

The event pertaining to the detection of a target $i$ depends only on its
state $x_i$, i.e., for each $k$ in $K$, each $n$, each $(x_i)_{i=1}^n \in \mathcal{X}^n$ and each $\delta \in \mathcal{D}(n)$,

$$P_D(\delta \mid (x_i)_{i=1}^n, n, k) = \prod_{i=1}^n p_D(x_i|k)^{\delta(i)}\, (1 - p_D(x_i|k))^{(1 - \delta(i))} \tag{53}$$

with a common detection probability function $p_D(\cdot|k): \mathcal{X} \to [0,1]$.


Assumption A4:

The number of false alarms for each data set $k$ is independent of any
target state or any other data set variable and has the distribution $p_{N_{FA}}(\cdot|k)$, i.e.,

$$P_{N_{FA}}(m \mid \delta,x,n,k) = p_{N_{FA}}(m|k)$$

for all $(\delta,x,n,k)$. Given the number of false alarms in a data set $k=(t,s)$,
the values of the false alarms are i.i.d. with the common probability density
function $p_{FA}(\cdot|k)$ on $\mathcal{Y}_s$.


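Assumption A5:

The measurement error in a measurement which originates from a target $i$ in
any data set $k=(t,s)$ depends only on the target state $x_i(t)$ and is modeled by
a common transition probability density function $p_M(\cdot|\cdot,k): \mathcal{Y}_s \times \mathcal{X} \to [0,\infty)$ from
$\mathcal{X}$ to $\mathcal{Y}_s$ for each $k$. Thus we have

$$P_M\big((y_j)_{j=1}^m \mid \alpha, m, \delta, (x_i)_{i=1}^n, n, k\big) = \Big( \prod_{i:\, \delta(i)=1} p_M(y_{\alpha(i)} \mid x_i, k) \Big) \Big( \prod_{j \notin \mathrm{Image}(\alpha)} p_{FA}(y_j|k) \Big) \tag{55}$$

for every $n \ge 0$, $(x_i)_{i=1}^n \in \mathcal{X}^n$, $\delta \in \mathcal{D}(n)$, $m \ge 0$, $\alpha \in \mathcal{A}(\{i \mid \delta(i)=1\}, \{1,\ldots,m\})$ and
$(y_j)_{j=1}^m \in (\mathcal{Y}_s)^m$.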

We should note that Assumptions A1 - A5 are assumptions which are "additional"
to Assumptions 1-8. For example, equations (53) through (55) satisfy the
requirement of Assumption 8. First let us discuss an important implication
of the independence assumptions (49) to (55): For each $k=(t,s)$ in $K$ and each
track $\tau \in \mathcal{T}(k)$, we define the cumulative data set $Z^{(k)}$ restricted to track $\tau$ by

$$Z_{|\tau}^{(k)} = \bigcup_{k' \le k} \{Y(\tau,k')\} \times \{k'\}$$

where $Y(\cdot,\cdot): (\bigcup_{k \in K} \mathcal{T}(k)) \times K \to (\bigcup_{s \in S} \mathcal{Y}_s) \cup \{\theta\}$ is defined by

$$Y(\tau,k) = \begin{cases} \theta & \text{if } \tau \cap (J_M(k) \times \{k\}) = \emptyset \\ y_j(k) & \text{if } (j,k) \in \tau \end{cases}$$

for each $(\tau,k) \in (\bigcup_{k \in K} \mathcal{T}(k)) \times K$, where $y_j(k)$ is the $j$-th measurement in data set $k$.
The usage of $\theta$ is again symbolic, i.e., $(\theta,k) \in Z_{|\tau}^{(k)}$ means no measurement at $k$
in track $\tau$. On the other hand, $(y,t,s) \in Z_{|\tau}^{(k)}$ means that $y \in \mathcal{Y}_s$ is the
measurement (value) which is hypothesized (by $\tau$) to originate from a target creating
track $\tau$.


Then consider a Markov process $x$ on $\mathcal{X}$, which is defined by $q_0$ and $f_{\Delta t}$,
and an incomplete observation mechanism which creates a measurement $Y(\tau,k)$ if
it succeeds in creating a measurement and provides nothing (represented by $\theta$) if
it fails, according to $p_D$ and $p_M$ described in (53) and (55), i.e., assume that

$$\mathrm{Prob.}\{Y(\tau,k) \in dy \mid Y(\tau,k) \ne \theta, x(t) = x\}\, \mathrm{Prob.}\{Y(\tau,k) \ne \theta \mid x(t) = x\} = p_M(y|x,k)\, p_D(x|k)\, \mu(dy) \tag{58}$$

and

$$\mathrm{Prob.}\{Y(\tau,k) = \theta \mid x(t) = x\} = 1 - p_D(x|k) . \tag{59}$$

Then the problem of calculating the state distribution of $x$ at time $t$ conditioned
by $Z_{|\tau}^{(k)}$ ($k=(t,s)$) can be solved by applying standard filtering theory, i.e.,
by extrapolation using $f_{\Delta t}$ and updating using (58) or (59). Let us denote the
solution to this "mini" or "single-target" problem by $p_\tau^{(k)}$, i.e., for every $\tau \in \bigcup_{k \in K} \mathcal{T}(k)$,
let

$$p_\tau^{(k)}(x)\, \mu(dx) = \mathrm{Prob.}\{x(t) \in dx \mid Z_{|\tau}^{(k)}\} . \tag{60}$$


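The independence assumptions A1 to A5 then imply

$$P_X^{(k)}\Big(\prod_{i=1}^n dx_i \,\Big|\, \omega, n, \lambda, Z^{(k)}\Big) = \mathrm{Prob.}\Big\{X(t) \in \prod_{i=1}^n dx_i \,\Big|\, \Omega_k = \omega, N_T = n, \Lambda_{|k} = \lambda, Z^{(k)}\Big\} = \Big( \prod_{\tau \in \lambda} p_\tau^{(k)}(x_{\omega(\tau)})\, \mu(dx_{\omega(\tau)}) \Big) \Big( \prod_{i \notin \mathrm{Image}(\omega)} p_\emptyset^{(k)}(x_i)\, \mu(dx_i) \Big) \tag{61}$$

for each $k$ in $K$, each $\lambda \in \mathcal{H}(k)$, each $n \ge \#(\lambda)$ and each $\omega \in W(\lambda,n)$. Although (61)
can be shown without difficulty, it is yet to be proven (actually, (61) is
included in Theorem 2 below).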


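Let us introduce another useful notation: For each $k=(t,s)$ in $K$, define
the track-measurement likelihood function $L_k: (\mathcal{Y}_s \cup \{\theta\}) \times \mathcal{T}(k) \to [0,\infty)$ by

$$L_k(y,\tau) = \int g_k(y|x)\, \tilde{p}_\tau^{(k)}(x)\, \mu(dx) \tag{62}$$

where $g_k(\cdot|\cdot): (\mathcal{Y}_s \cup \{\theta\}) \times \mathcal{X} \to [0,\infty)$ is defined by

$$g_k(y|x) = \begin{cases} p_M(y|x,k)\, p_D(x|k) & \text{if } y \ne \theta \\ 1 - p_D(x|k) & \text{if } y = \theta \end{cases} \tag{63}$$

and, for every $\tau \in \mathcal{T}(k)$,

$$\tilde{p}_\tau^{(k)}(x) = \begin{cases} \displaystyle\int f_{t-t'}(x|x')\, p_{\tau_{|k'}}^{(k')}(x')\, \mu(dx') & \text{if } k \text{ has an immediate predecessor } k'=(t',s') \\ \displaystyle\int f_{t-t_0}(x|x')\, q_0(x')\, \mu(dx') & \text{otherwise.} \end{cases} \tag{64}$$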
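A minimal numerical sketch of the single-target quantities (62)-(64) is given below. It assumes, purely for illustration, that the state space is a one-dimensional grid with uniform spacing `dx`, that densities are stored as lists of values at the grid points, and that a missed detection (the symbol theta) is represented by `None`. All function names are hypothetical.

```python
def predict(prior, transition, dx):
    """Discretised (64): p_tilde(x_i) = sum_j f(x_i | x_j) prior(x_j) dx."""
    n = len(prior)
    return [sum(transition[i][j] * prior[j] for j in range(n)) * dx
            for i in range(n)]

def g(y, x, p_detect, meas_density):
    """Observation factor (63): detection with value y, or a miss (y is None)."""
    if y is None:
        return 1.0 - p_detect(x)
    return meas_density(y, x) * p_detect(x)

def likelihood_and_update(y, grid, p_tilde, dx, p_detect, meas_density):
    """Return L_k(y, tau) as in (62) and the Bayes-updated density."""
    weighted = [g(y, x, p_detect, meas_density) * p
                for x, p in zip(grid, p_tilde)]
    L = sum(weighted) * dx
    return L, [w / L for w in weighted]

# Three-cell example: uniform prior, a target that does not move,
# constant detection probability 0.8, and a missed detection.
grid, dx = [0.0, 1.0, 2.0], 1.0
prior = [1.0 / 3.0] * 3
stay = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
p_tilde = predict(prior, stay, dx)
L_miss, posterior = likelihood_and_update(
    None, grid, p_tilde, dx, lambda x: 0.8, lambda y, x: 0.0)
```

With a constant detection probability, a miss carries no information about the state, so the updated density equals the predicted one and the likelihood of the miss is simply one minus the detection probability.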

The  main  result  of  this  section  is  shown  below: 

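Theorem 2: Under Assumptions 1 to 8 and A1 to A5,
[1] for each $k$ in $K$, we have

$$\mathrm{Prob.}\{N_T = n \mid \Lambda_{|k}, Z^{(k)}\} = \begin{cases} \exp(-\nu(k))\, \dfrac{\nu(k)^{(n - \#(\Lambda_{|k}))}}{(n - \#(\Lambda_{|k}))!} & \text{if } n \ge \#(\Lambda_{|k}) \\ 0 & \text{otherwise} \end{cases} \tag{65}$$

where $(\nu(k))_{k \in K}$ is given by

$$\nu(k) = \begin{cases} \nu(k')\, L_k(\theta,\emptyset) & \text{if } k \text{ has an immediate predecessor } k' \\ \nu_0\, L_k(\theta,\emptyset) & \text{if } k \text{ is the minimum in } K , \end{cases} \tag{66}$$

[2] for each $k$ in $K$ which has an immediate predecessor $k'$ and each $\lambda$ in $\mathcal{H}(k)$,
when $Z(k)=((y_j)_{j=1}^m, m, k)$ is given, we have

$$P_\Lambda^{(k)}(\lambda \mid Z^{(k)}) = P_\Lambda^{(k')}(\lambda_{|k'} \mid Z^{(k')}) \cdot \prod_{\tau \in \lambda} (\nu(k'))^{\varepsilon(\tau)} L_k(Y(\tau,k),\tau) \cdot \frac{((m - n_D(\lambda|k))!)\, p_{N_{FA}}(m - n_D(\lambda|k) \mid k) \prod_{j \in J_{FA}(\lambda,m|k)} p_{FA}(y_j|k)}{P_Z^{(k)}(Z(k) \mid Z^{(k')}) \cdot (m!) \cdot \exp(\nu(k')(1 - L_k(\theta,\emptyset)))} \tag{67}$$

where

$$J_{FA}(\lambda,m|k) = \{j \in \{1,\ldots,m\} \mid \text{there is no } \tau \text{ in } \lambda \text{ such that } \tau \cap (\{1,\ldots,m\} \times \{k\}) = \{(j,k)\}\}$$

$$\varepsilon(\tau) = \begin{cases} 0 & \text{if } \tau_{|k'} \ne \emptyset \\ 1 & \text{if } \tau_{|k'} = \emptyset \end{cases}$$

whereas $n_D(\lambda|k)$ ($= m - \#(J_{FA}(\lambda,m|k))$) is as previously defined, and

[3] for each $k$ in $K$, we have (61) for each $\lambda \in \mathcal{H}(k)$, each $n \ge \#(\lambda)$
and each $\omega \in W(\lambda,n)$, and moreover,

$$p_\tau^{(k)}(x) = L_k(Y(\tau,k),\tau)^{-1}\, g_k(Y(\tau,k)|x)\, \tilde{p}_\tau^{(k)}(x) \tag{70}$$

for each $\tau \in \mathcal{T}(k)$.

Proof: See Appendix B.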
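The structure of the update in Theorem 2 can be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: the track-measurement likelihood values are taken as given numbers, the factors common to all hypotheses (m!, the exponential term and the normalizing constant) are dropped and recovered by normalization, and all names are hypothetical.

```python
from math import exp, factorial

def update_nu(nu_prev, L_miss_empty):
    """(66): expected number of still-undetected targets after data set k."""
    return nu_prev * L_miss_empty

def hypothesis_weight(parent_prob, track_terms, nu_prev, p_nfa, fa_densities):
    """Hypothesis-dependent factors of the right-hand side of (67).

    track_terms: (L_value, is_new) per track, where L_value stands for
    L_k(Y(tau,k), tau) and is_new plays the role of epsilon(tau).
    fa_densities: p_FA(y_j|k) for the measurements declared false alarms.
    p_nfa: function n -> probability of n false alarms.
    """
    n_fa = len(fa_densities)
    w = parent_prob * factorial(n_fa) * p_nfa(n_fa)
    for d in fa_densities:
        w *= d
    for L_value, is_new in track_terms:
        w *= (nu_prev if is_new else 1.0) * L_value
    return w

def normalise(weights):
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

# One measurement, no earlier tracks: false alarm versus newly detected target.
poisson1 = lambda n: exp(-1.0) / factorial(n)   # assumed false-alarm count law
weights = {
    "false alarm": hypothesis_weight(1.0, [], 4.0, poisson1, [0.1]),
    "new target": hypothesis_weight(1.0, [(0.05, True)], 4.0, poisson1, []),
}
post = normalise(weights)
```

In this toy case the new-target hypothesis carries weight 4 x 0.05 against a false-alarm density of 0.1, so it receives two thirds of the posterior mass.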


We should note that an empty track $\emptyset$ is always included in $\mathcal{T}(k)$ for any
$k$ in $K$ according to our definition. Thus, $\tilde{p}_\emptyset^{(k)}(\cdot)$ is the a priori distribution
density function (at $k$) which is common to all the undetected targets up to $k$
(not including $k$), and $p_\emptyset^{(k)}(\cdot)$ is the a posteriori distribution density function
(at $k$) of targets undetected up to and including $k$. The definition, (62) and (63),
of the track-measurement likelihood functions gives us the following verbal
expression of (67): The posterior probability of any hypothesis is the
product of

(1) the a priori probability, $P_\Lambda^{(k')}(\lambda_{|k'} \mid Z^{(k')})$, or the probability of the
parent of $\lambda$,

(2) the likelihood of the set of measurements, $J_{FA}(\lambda,m|k)$, to be the false
alarm set, $((m - n_D(\lambda|k))!)\, p_{N_{FA}}(m - n_D(\lambda|k) \mid k) \prod_{j \in J_{FA}(\lambda,m|k)} p_{FA}(y_j|k)$,

(3) the likelihood $L_k(Y(\tau,k),\tau)$ of measurement $Y(\tau,k)$ ($\ne \theta$) originating
from a previously detected target ($\tau_{|k'} \ne \emptyset$),

(4) the likelihood $L_k(Y(\tau,k),\tau)$ of a previously detected target ($\tau_{|k'} \ne \emptyset$)
being undetected ($Y(\tau,k) = \theta$), and

(5) the likelihood $\nu(k') L_k(Y(\tau,k),\tau)$ of a measurement $Y(\tau,k)$ ($\ne \theta$)
originating from a newly detected target ($\tau_{|k'} = \emptyset$),

divided by the normalizing constant. Likewise, we may call $L_k(\theta,\emptyset)$ the
likelihood of an undetected target remaining undetected.


As seen in (67) and (70), the evaluation of hypotheses can be done
at the track level due to the independence assumptions. When $k$ is the minimum
in $K$, the left hand side of (67) can be calculated by replacing $P_Z^{(k)}(Z(k) \mid Z^{(k')})$,
$\nu(k')$, $\varepsilon(\tau)$ and $P_\Lambda^{(k')}(\lambda_{|k'} \mid Z^{(k')})$ by $P_Z^{(k)}(Z^{(k)})$, $\nu_0$, 1 and 1, resp. The initial
condition for the filtering equation (70) is already included in the definition
(64) of $\tilde{p}_\tau^{(k)}$. Thus we have given a complete description of the (so called)
i.i.d. multitarget tracker, whose information state at $k$ is

$$\Big( \big(P_\Lambda^{(k)}(\lambda \mid Z^{(k)})\big)_{\lambda \in \mathcal{H}(k)},\ \big(p_\tau^{(k)}\big)_{\tau \in \mathcal{T}(k)},\ \nu(k) \Big)$$

that is to be propagated forward.


Although the algorithm shown in Theorem 2 is less general than that shown
in Theorem 1, it covers nearly all the existing multitarget tracking algorithms
as its subset, as shown in the next section. As is well known, the cardinality
of $\mathcal{H}(k)$ and $\mathcal{T}(k)$ grows very rapidly. Hence any implementation of the general
algorithm described in this section requires further consideration. Such issues
are discussed in Part 2.


VI.  Relation  to  Existing  Results 


In this section, we use the i.i.d. model; i.e., we retain
Assumptions  Al  to  A5  in  addition  to  Assumptions  1  to  8.  These  assumptions 
are  inherent  in  most  of  the  multitarget  tracking  literature  published  thus 
far,  although  they  are  sometimes  not  stated  explicitly. 


Before  discussing  the  relation  of  the  general  algorithm  described  by 
Theorem  2  to  existing  results,  let  us  describe  a  batch-processing  version  of  the 
same  algorithm.  The  theorem  described  below  is  easily  obtained  by  applying 
(67)  repeatedly.  Hence  the  proof  is  omitted. 


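Theorem 3: Under Assumptions 1 to 8 and A1 to A5 with the notation
$n_{FA}(\lambda|k') = \#(J_{FA}(\lambda, N_M(k') \mid k'))$, for any $k$ in $K$ and any $\lambda$ in $\mathcal{H}(k)$, we have

$$P_\Lambda^{(k)}(\lambda \mid Z^{(k)}) = C_B^{(k)}(Z^{(k)})^{-1} \cdot \ell_{FA}^{(k)}(\lambda) \cdot \prod_{\tau \in \lambda} \ell_\tau^{(k)} \tag{72}$$

where

$$C_B^{(k)}(Z^{(k)}) = P_Z^{(k)}(Z^{(k)}) \cdot \Big( \prod_{k' \le k} N_M(k')! \Big) \cdot \exp(\nu_0 - \nu(k)) , \tag{73}$$

$$\ell_{FA}^{(k)}(\lambda) = \prod_{k' \le k} (n_{FA}(\lambda|k')!)\, p_{N_{FA}}(n_{FA}(\lambda|k') \mid k') \prod_{j \in J_{FA}(\lambda, N_M(k')|k')} p_{FA}(y_j(k') \mid k') \tag{74}$$

$$\ell_\tau^{(k)} = \nu(k(\tau)) \prod_{k(\tau) \le k' \le k} L_{k'}(Y(\tau,k'),\tau) = \nu_0 \prod_{k' \le k} L_{k'}(Y(\tau,k'),\tau) \tag{75}$$

for every $\tau \in \mathcal{T}(k)$ where $k(\tau)$ is the minimum of the set $\{k' \in K \mid k' \le k,\ \tau \cap (J_M(k') \times \{k'\}) \ne \emptyset\}$,
when $Z^{(k)} = \bigcup_{k' \le k} ((y_j(k'))_{j=1}^{N_M(k')}, N_M(k'), k')$.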

It is possible to prove Theorem 3 without using Theorem 2 and to deduce (67)
from (72). Theorem 3 states that the a posteriori probability of each hypothesis $\lambda$
at $k$ is the product of the track likelihood $\ell_\tau^{(k)}$ of each track $\tau$ in $\lambda$ and
the false alarm likelihood $\ell_{FA}^{(k)}(\lambda)$, divided by the normalizing constant, and provides us
with a unified view of what we may call "track likelihood" approaches such as the
algorithms described in [10] and [11]. The track likelihood updating equation,
$\ell_\tau^{(k)} = \ell_{\tau_{|k'}}^{(k')} L_k(Y(\tau,k),\tau)$, follows immediately from (75) with $k'$ being the immediate
predecessor of $k$. When each $\tilde{p}_\tau^{(k)}$ and $p_M$ are gaussian, $L_k(y,\tau)$ ($y \ne \theta$) is the
exponential of the negative squared innovations norm (times some constant), which is
sometimes called a "score" (see [12]). As is well known, the squared innovations
norm or its sum over a track may be considered a chi-square random variable.

Let  us  make  one  more  assumption: 
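and

$$\hat{\ell}_\tau^{(k)} = \nu_0 \prod_{k' \le k} \hat{L}_{k'}(Y(\tau,k'),\tau) \tag{75'}$$

with

$$\hat{L}_k(y,\tau) = \begin{cases} L_k(\theta,\tau) & \text{if } y = \theta \\ \dfrac{L_k(y,\tau)}{\nu_{FA}(k)\, p_{FA}(y|k)} & \text{if } y \ne \theta . \end{cases} \tag{76}$$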

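Take the logarithm of (72') and ignore the normalizing constant. Then we have
a function $h: \mathcal{H}(k) \to (-\infty,\infty)$ defined by

$$h(\lambda) = \sum_{\tau \in \lambda} \log(\hat{\ell}_\tau^{(k)}) = \sum_{\tau \in \mathcal{T}(k) \setminus \{\emptyset\}} \log(\hat{\ell}_\tau^{(k)})\, \chi(\tau;\lambda) \tag{77}$$

for every $\lambda$ in $\mathcal{H}(k)$ ($\chi(\cdot;A)$ is the indicator (characteristic function) of set $A$).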

With (77), we can interpret Morefield's 0-1 integer programming algorithm
described in [9]. Namely, the problem of obtaining the maximum a posteriori
probability (MAP) hypothesis $\lambda$ at $k$ given $Z^{(k)}$ is equivalent to maximizing (77)
with respect to $\lambda$ in $\mathcal{H}(k)$. In (77), $(\chi(\tau;\lambda))_{\tau \in \mathcal{T}(k) \setminus \{\emptyset\}}$ is the 0-1 vector to
which the 0-1 integer programming is applied. Then the constraint imposed by
Assumptions 5 and 6 can be written as a 0-1 matrix-vector inequality as
described in [9].
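The maximization of (77) can be illustrated by exhaustive search over a small candidate track set. This sketch is not Morefield's integer programming method; it is a brute-force stand-in that makes the objective and the feasibility constraint concrete. It assumes that each candidate track is a frozenset of (j, k) measurement indices carrying a given normalized likelihood, that a set of tracks is feasible when no measurement index is used twice, and that the empty hypothesis (all measurements false alarms) scores zero. The names are hypothetical.

```python
from itertools import combinations
from math import log

def map_hypothesis(track_likelihoods):
    """Brute-force maximiser of h(lambda) = sum of log track likelihoods."""
    tracks = list(track_likelihoods)
    best, best_score = frozenset(), 0.0   # empty hypothesis: all false alarms
    for r in range(1, len(tracks) + 1):
        for subset in combinations(tracks, r):
            used = [idx for t in subset for idx in t]
            if len(used) != len(set(used)):
                continue                  # two tracks share a measurement
            score = sum(log(track_likelihoods[t]) for t in subset)
            if score > best_score:
                best, best_score = frozenset(subset), score
    return best, best_score

t1 = frozenset({(1, 1), (1, 2)})
t2 = frozenset({(1, 2)})
t3 = frozenset({(2, 2)})
best, score = map_hypothesis({t1: 5.0, t2: 2.0, t3: 0.5})
```

Here `t1` and `t2` compete for measurement (1, 2), and adding `t3` lowers the score because its normalized likelihood is below one, so the single track `t1` is selected. The search is exponential in the number of candidate tracks, which is why an integer programming formulation is attractive in practice.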

In most of the existing multitarget tracking literature, in addition to
Assumptions 1 to 8, A1 to A5 and A4', the following assumptions are made:

(1) $f_{\Delta t}$ is defined by a linear dynamical system driven by a white noise,

(2) $p_M(y|x,k)$ is defined by a linear-gaussian measurement equation,

$$y = H_k x + v_k$$

with an appropriate matrix $H_k$ and an independent additive gaussian noise $v_k$, and

(3) $p_{FA}(\cdot|k)$ is uniform over $\mathcal{Y}_s$.

In many cases, the initial distribution (modelled by $q_0(\cdot)$) of undetected targets
as well as $\tilde{p}_\emptyset^{(k)}$ is relatively "uniform" or has a large variance when compared
with the variance of the measurement noise $v_k$. Furthermore, if
$p_D(\cdot|k)$ is constant over the field of view of each sensor, $\tilde{p}_\tau^{(k)}$ and $p_\tau^{(k)}$ with
$\tau \ne \emptyset$ may be reasonably well approximated by gaussian densities with relatively
small variances compared to the size of the field of view; hence, we have
approximately
approximately 


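$$L_k(y,\tau) = \begin{cases} \pi_D \displaystyle\int p_M(y|x,k)\, \tilde{p}_\tau^{(k)}(x)\, \mu(dx) = \dfrac{\pi_D}{(2\pi)^{N/2} (\det(\Gamma))^{1/2}} \exp\big(-\tfrac{1}{2} \|e_{k|k'}\|^2_{\Gamma^{-1}}\big) & \text{if } y \ne \theta \\ 1 - \pi_D & \text{if } y = \theta \end{cases} \tag{78}$$

where $e_{k|k'} = y - H_k \hat{x}_{k|k'}$ is the innovations, $\Gamma = \mathrm{Var}(v_k) + H_k \Sigma_{k|k'} H_k^T$ is the innovations
variance, $\hat{x}_{k|k'}$ and $\Sigma_{k|k'}$ are the mean and variance of $\tilde{p}_\tau^{(k)}$, resp.
($\|x\|^2_A = x^T A x$ and $a^T$ is the transpose of vector or matrix $a$), and $\pi_D$ is the
constant value of $p_D(\cdot|k)$. When updated by (70), an approximation similar to
(78) leads us to the usual Kalman filter equation if $y \ne \theta$ and to no updating
($p_\tau^{(k)} = \tilde{p}_\tau^{(k)}$) if $y = \theta$.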

Therefore, with all the additional assumptions described above, it is easy
to see that (67) becomes Reid's algorithm described in [4] with

$$\beta_{NT} = \nu(k')\, L_k(y,\tau) \tag{79}$$

for $\tau$ such that $\tau_{|k'} = \emptyset$, where $k'$ is the immediate predecessor of $k$. $\beta_{NT}$ is called
the "density of previously unknown targets that has been detected" in [4].
Moreover, the constant $p_D(\cdot|k)$ implies that $N_{DT} = \#(\{\tau \in \lambda \mid \tau_{|k'} \ne \emptyset,\ \tau \cap (J_M(k) \times \{k\}) \ne \emptyset\})$
has a binomial probability distribution given $N_{TGT} = \#(\lambda_{|k'})$, enabling us to use $N_{DT}$
(among others) to expand (39), which is actually done in [4]. On the other
hand, $\beta_{NT}$ given by (79) should be a function of $k$ and $y$. If it is fixed to a certain
value, newly detected targets acquire increasingly (w.r.t. $k$) unjustifiably high
possibilities. To prevent this from happening, Reid proposed to adjust $\beta_{NT}$ as
described in a paragraph in [4]:

"... a calculation of $\beta_{NT}$, the density of new (i.e., unknown) targets,
is performed whenever a data set from a type 1 sensor is received ...
$\beta_{NT}$ depends upon the number of times the area has been observed by a type 1
sensor and possible flux of undetected targets into and out of the area."

Aside from this description, there is no further discussion of this calculation
of $\beta_{NT}$ in [4].
In contrast, according to our formulation, $\beta_{NT}$ is analytically given by (79).
In the original report [13] by Reid, a rather heuristic method for calculating $\beta_{NT}$
is described, in which the sensor field of view is divided into many cells and
the inflow/outflow from cell to cell of undetected targets is calculated. As
seen in the previous section, however, the likelihood (79) of a measurement $y$
originating from a newly detected target is calculated from $\nu(k')$ and $\tilde{p}_\emptyset^{(k)}$, both
of which are calculated recursively for all $k$'s. The exact calculation of
$p_\tau^{(k)}$ or $\tilde{p}_\tau^{(k)}$, and accordingly (79), is, however, not generally possible due to
the nonlinearity of $p_D(\cdot|k)$. Therefore effective approximation techniques must
be exploited. With an appropriate approximation of $\tilde{p}_\emptyset^{(k)}$ and (79), we can
properly consider the fact that a newly detected target would most likely appear
on the edges of sensor fields of view and not in the middle. In many cases,
when only a small number of measurement indices are in a track $\tau$, $p_\tau^{(k)}$ or $\tilde{p}_\tau^{(k)}$ may
not be well approximated by gaussian densities. For example, consider a case
where targets move in a 1-dimensional space, the a priori target velocity
information contained in $q_0(\cdot)$ is represented by a uniform distribution on a
possible velocity range and there is no velocity measurement (position-only
measurement). In such a case, gaussian approximations of $p_\tau^{(k)}$ or $\tilde{p}_\tau^{(k)}$ are very poor.
Appropriate approximation methods are, therefore, called for in order to calculate
$p_\tau^{(k)}$ and $\tilde{p}_\tau^{(k)}$, and accordingly, the likelihood function $L_k$. As mentioned in [4],
such approximations coupled with hypothesis management techniques (described in
Part 2) can be viewed as so called "track initiation processes."

On the other hand, when a separate track initiation mechanism is assumed,
such as in [14], one of the most difficult parts of the multitarget tracking
problem is removed automatically. Any tracking algorithm with a separate track
initiator can be incorporated into our framework as follows: First extend the
set $S$ of sensors to $\bar{S} = \{s_0\} \cup S$ where $s_0$ is a track initiator as a "super" sensor
which creates a probability-one hypothesis, $\{\tau_1, \ldots, \tau_n\}$, with $\tau_i = \{(i,t_0,s_0)\}$
being the $i$-th a priori track. Then, if we replace $K$ by $\bar{K} = K \cup \{(t_0,s_0)\}$, Theorem
2 or 3 provides a general Bayesian formula for cases in which a separate track
initiator is employed. Those cases in which a track initiator provides new
tracks in the middle as well as the beginning of the tracking may be similarly
incorporated. Although the use of a tracker and a track initiator in parallel
can be handled in our general framework, such a use may be seen as a departure
from a purely Bayesian approach. A track initiator does have memory. Unless the
track initiator shares no data sets with the tracker, the correlation
between the data sets and the outputs of the track initiator cannot be ignored.
Therefore, we must either divide the data sets, one for the track initiator and
the other for the tracker, or use more or less heuristic methods to discount
effects such as double counting or excessive reliance on the same data. For this
reason, we may say multitarget trackers used with separate track initiators are
either restrictive or "sub-optimal."

As discussed in greater detail in Part 2, since $\mathcal{H}(k)$ is the collection
of mutually distinct and collectively exhaustive hypotheses, aggregation or
combining of hypotheses is compatible with our formulation and is a great help
from the viewpoint of implementation. However, in order to perform such
operations properly, we must know the correspondence among tracks in the
hypotheses to be combined. When we assume a separate track initiator and there
is no newly detected target, such correspondence is obvious and it is possible
to combine all the hypotheses so that there is always only one (and hence
probability-one) hypothesis to be propagated forward. The JPDA (Joint
Probabilistic Data Association) method described in [14] thoroughly exploits
this condition.

One of the points which we have stressed in the previous two sections is
that multitarget tracking requires a hierarchical algorithm; the evaluation of
hypotheses is at the top and (generally nonlinear) filtering at the bottom.
Thus, the construction of multitarget trackers in many different situations
creates a wide range of nonlinear filtering problems. The use of hybrid-state
Markovian models enables us to treat a wide range of complicated situations,
at least in principle. Fairly complicated dynamics are occasionally used in
the multitarget tracking literature, e.g., [15] with birth-death processes. To
the best of our knowledge, however, there is still no satisfactory nonlinear
filtering for maneuvering targets. When there are no discrete-part dynamics
(target classification problems, etc.), the required filtering is substantially
simpler. In particular, if, in addition, the continuous-part dynamics are
linear-Gaussian, the sum-of-Gaussian filtering described in [17] may be the most
appropriate. On the other hand, even when there is no discrete-part state and all
track statistics are Gaussian, because of the huge computational requirement
generally associated with any multitarget tracker, efforts to develop filtering
techniques by which each track-measurement likelihood can be quickly calculated
are always worthwhile. One such effort is described in [16].

VII.  Conclusion


According to our viewpoint, targets and sensors in a general multitarget
tracking environment are properly modeled only when targets are
modeled as a random-set process and each sensor is regarded as a mechanism
which maps this set to other random sets, i.e., measurement data
sets. A very general target/sensor model has been defined and a general
recursive multitarget tracking algorithm has been derived based upon
this viewpoint and Bayes' rule. Then a special case with a so-called
i.i.d. (target) model has been examined in more detail, and a
general multitarget tracking algorithm, in both recursive and
batch-processing forms, has been derived. Besides the generality of
(individual) target dynamics and sensor models, two previously ignored but
realistically very important factors have been pointed out: (1) state-dependent
probability of detection and (2) a precise definition of the likelihood
of a measurement originating from a newly detected target. Our
general i.i.d. tracking algorithm has been compared with existing
algorithms which share the common concepts of tracks and hypotheses. We
have succeeded in providing a unified view of existing algorithms by
showing that the general algorithm is in fact a generalization of many
well-known algorithms. We have also shown that the general multitarget
tracking algorithm is hierarchical in nature and always contains a nonlinear
(or linear) filtering algorithm as a sub-algorithm.

This paper, Part 1, covers most of our theoretical developments on
multitarget tracking. Part 2 of this paper will consider hypothesis
management techniques and other implementation issues. The term "hypothesis
management" is borrowed from artificial intelligence (AI) terminology.
In our context it means a set of procedures which keep the number of hypotheses,
and hence a multitarget tracking algorithm, under control. Due to the rapid
growth of the number of hypotheses, no multitarget tracking algorithm is
implementable without appropriate hypothesis management procedures. At
Advanced Information & Decision Systems (AI&DS), we have developed a system
called GTC (Generalized Tracker/Classifier) which implements all the
problem-independent parts of the general (i.i.d.) tracking algorithm. In Part 2
we will present some numerical results to illustrate the use of this system.

Appendix A: (Proof of Theorem 1)


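As mentioned in the early part of Section IV, the most crucial part of
the proof of the theorem is how to calculate or expand
$P(Z^{(k)}, \Lambda|_k \mid Z^{(k')}, \Lambda|_{k'})$,
which is outlined in (30) and (39). The precise meaning of (39) is

$$
\begin{aligned}
&\mathrm{Prob.}\{y(k)\in Y,\ N_M(k)=m,\ \Lambda|_k=\lambda \mid \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\quad= \sum_{n=\#(\lambda)}^{\infty} \mathrm{Prob.}\{N_T=n \mid \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\qquad\cdot \sum_{\omega'\in\mathcal{W}(\lambda|_{k'},n)} \mathrm{Prob.}\{\Omega_{k'}=\omega' \mid N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\qquad\cdot \sum_{\omega\in\mathcal{W}(\lambda,n)} \mathrm{Prob.}\{y(k)\in Y,\ N_M(k)=m,\ \Lambda|_k=\lambda,\ \Omega_k=\omega \mid \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\}
\end{aligned}
\tag{A1}
$$

for every $\lambda \in \mathcal{H}(k)$, every $m \ge 0$ and every measurable set $Y$
in $(\mathcal{Y}_s)^m$. The first and second terms are already given by (37) (part
of the recursive assumptions) and by (33). The third term in (A1) is further
expanded as

$$
\begin{aligned}
&\mathrm{Prob.}\{y(k)\in Y,\ N_M(k)=m,\ \Lambda|_k=\lambda,\ \Omega_k=\omega \mid \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\quad= \int_{\mathcal{X}^n} \mathrm{Prob.}\{y(k)\in Y,\ N_M(k)=m,\ \Lambda|_k=\lambda,\ \Omega_k=\omega \mid X(t)=x,\ \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\qquad\quad\cdot\ \mathrm{Prob.}\{X(t)\in dx \mid \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\}.
\end{aligned}
\tag{A2}
$$

The conditional probability measure on $\mathcal{X}^n$ in (A2) is given by (38)
(part of the recursive assumptions), i.e., we have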

$$
\mathrm{Prob.}\{X(t)\in dx \mid \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\}
= \int_{\mathcal{X}^n} \Phi_{\Delta t}(dx \mid x')\, P_X^{(k')}(dx' \mid \omega', n, \lambda|_{k'}, Z^{(k')})
\tag{A3}
$$

with $\Delta t = t - t'$.


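On the other hand, the integrand in (A2) can be further expanded as

$$
\begin{aligned}
&\mathrm{Prob.}\{y(k)\in Y,\ N_M(k)=m,\ \Lambda|_k=\lambda,\ \Omega_k=\omega \mid X(t)=x,\ \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&= \sum_{\delta} \mathrm{Prob.}\{F_D(k)=\delta \mid X(t)=x,\ \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\quad\cdot \mathrm{Prob.}\{N_M(k)=m \mid F_D(k)=\delta,\ X(t)=x,\ \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\quad\cdot \sum_{a\in\mathcal{A}(\{\{i\}\mid\delta(i)=1\},\{1,\ldots,m\})} \mathrm{Prob.}\{A(k)=a \mid N_M(k)=m,\ F_D(k)=\delta,\ X(t)=x,\ \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\quad\cdot \mathrm{Prob.}\{\Lambda|_k=\lambda \mid A(k)=a,\ N_M(k)=m,\ F_D(k)=\delta,\ X(t)=x,\ \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\quad\cdot \mathrm{Prob.}\{\Omega_k=\omega \mid A(k)=a,\ N_M(k)=m,\ F_D(k)=\delta,\ X(t)=x,\ \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\quad\cdot \mathrm{Prob.}\{y(k)\in Y \mid \Omega_k=\omega,\ A(k)=a,\ N_M(k)=m,\ F_D(k)=\delta,\ X(t)=x,\ \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\}.
\end{aligned}
\tag{A4}
$$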


Under Assumptions 1-8, the first, the second, the third and the sixth terms
are given by (15), (16), (19) and (20). The fourth and the fifth terms merely
check the consistency among $(\lambda, \omega, \omega', a)$. The fourth term,
$\mathrm{Prob.}\{\Lambda|_k = \lambda \mid \ldots\}$, is 1 if

$$
\lambda = \bigcup_{\tau'\in\lambda|_{k'}} \Big\{\tau' \cup \{(a(\{\omega'(\tau')\}),k)\}\Big\}
\ \cup \bigcup_{i\notin\mathrm{Image}(\omega')} \Big\{\{(a(\{i\}),k)\}\Big\}
\tag{A5}
$$

and 0 otherwise. Likewise the fifth term,
$\mathrm{Prob.}\{\Omega_k = \omega \mid \ldots\}$, is 1 if

$$
\omega(\tau) =
\begin{cases}
\omega'(\tau|_{k'}) & \text{if } \tau|_{k'} \ne \phi, \\
\text{a unique } i \text{ such that } \tau = \{(a(\{i\}),k)\} & \text{otherwise}
\end{cases}
\tag{A6}
$$

and is 0 otherwise.

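When the subset $\{\tau\in\lambda \mid \tau|_{k'}=\phi\}$ of new tracks in $\lambda$
is not empty and

$$
n - \#\{\tau\in\lambda \mid \tau|_{k'}\ne\phi\} > \#\{\tau\in\lambda \mid \tau|_{k'}=\phi\},
$$

there is more than one $(a,\delta)$ which satisfies
$a \in \mathcal{A}(\{\{i\}\mid\delta(i)=1\},\{1,\ldots,m\})$ and (A5).
On the other hand, for given $\lambda$ and $\omega$, there exists one and only one
$(a,\delta)$ such that $a \in \mathcal{A}(\{\{i\}\mid\delta(i)=1\},\{1,\ldots,m\})$
and (A6) hold.

Therefore, for any $m \ge 0$, any measurable set $Y$ in $(\mathcal{Y}_s)^m$,
any $\lambda \in \mathcal{H}(k)$, any $n \ge \#(\lambda)$, any $x \in \mathcal{X}^n$
and any $\omega' \in \mathcal{W}(\lambda|_{k'},n)$, if
$\omega \in \mathcal{W}(\lambda,n)$ satisfies

$$
\omega(\tau) = \omega'(\tau|_{k'}) \quad \text{for all } \tau\in\lambda \text{ such that } \tau|_{k'} \ne \phi,
\tag{A7}
$$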

we have

$$
\begin{aligned}
&\mathrm{Prob.}\{y(k)\in Y,\ N_M(k)=m,\ \Lambda|_k=\lambda,\ \Omega_k=\omega \mid X(t)=x,\ \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\quad= P_D(\delta \mid x,n,k)\, P_{N_{FA}}(m-n_D(\lambda|k) \mid \delta,x,n,k)\,
\frac{(m-n_D(\lambda|k))!}{m!} \int_Y p_y(y \mid a,m,\delta,x,n,k)\, \mu_s^m(dy)
\end{aligned}
\tag{A8}
$$

where $n_D(\cdot)$, $\delta$ and $a$ are defined by (41), (43) and (44), resp.
Otherwise the left hand side of (A8) is zero. Substitute (A8) into (A2) and perform
the integration using (A3). Then the integral (A2) has the same value for every
$(\omega,\omega')$ in $\mathcal{W}(\lambda,n) \times \mathcal{W}(\lambda|_{k'},n)$
as long as (A7) is satisfied, due to Assumptions 2 and 8.


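Suppose $\lambda \in \mathcal{H}(k)$, $n \ge \#(\lambda)$ and
$\omega' \in \mathcal{W}(\lambda|_{k'},n)$. Then there exist
$\binom{n-\#(\lambda|_{k'})}{\#(\lambda)-\#(\lambda|_{k'})}$
combinations to choose sets of newly detected target indices for
$\{\tau\in\lambda \mid \tau|_{k'}=\phi\}$ and, for each such combination, there
exist $(\#(\lambda)-\#(\lambda|_{k'}))!$ isomorphisms from a chosen target index
set to $\{\tau\in\lambda \mid \tau|_{k'}=\phi\}$. Therefore,

$$
\binom{n-\#(\lambda|_{k'})}{\#(\lambda)-\#(\lambda|_{k'})}\,(\#(\lambda)-\#(\lambda|_{k'}))!
= \frac{(n-\#(\lambda|_{k'}))!}{(n-\#(\lambda))!}
\tag{A9}
$$

is the number of $\omega$'s in $\mathcal{W}(\lambda,n)$ which satisfy (A7) for fixed
$\lambda \in \mathcal{H}(k)$, $n \ge \#(\lambda)$, and
$\omega' \in \mathcal{W}(\lambda|_{k'},n)$. On the other hand, when we replace
$k$ by $k'$ in (33), we have

$$
\mathrm{Prob.}\{\Omega_{k'}=\omega' \mid N_T=n,\ \Lambda|_{k'}=\lambda|_{k'}\} = \big(\#(\mathcal{W}(\lambda|_{k'},n))\big)^{-1}.
$$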
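
The counting step above can be checked by brute force for small cases. The following is only an illustrative sketch: the function names are hypothetical, and the assignment of targets to tracks is modelled (as an assumption) as an injective choice of target indices 1..n, with the existing tracks occupying indices 1..n_old.

```python
from itertools import permutations
from math import comb, factorial

def count_consistent_assignments(n, n_old, n_new):
    """Brute-force count of target-to-track assignments that extend a fixed
    assignment of the n_old existing tracks (the analogue of condition (A7)).

    Assumption for illustration: existing tracks occupy indices 1..n_old, and
    each of the n_new new tracks must receive a distinct remaining index.
    """
    free = range(n_old + 1, n + 1)
    return sum(1 for _ in permutations(free, n_new))

def closed_form(n, n_old, n_new):
    """Factorial ratio: (n - n_old)! / (n - n_old - n_new)!."""
    return factorial(n - n_old) // factorial(n - n_old - n_new)

def binomial_form(n, n_old, n_new):
    """Choose the index set for the new tracks, then order it."""
    return comb(n - n_old, n_new) * factorial(n_new)

# n targets, 2 existing tracks, 3 new tracks
n, n_old, n_new = 6, 2, 3
brute = count_consistent_assignments(n, n_old, n_new)
```

Here `brute`, `closed_form(6, 2, 3)` and `binomial_form(6, 2, 3)` all describe the same count, which is the identity the argument relies on.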


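Consequently (A1) is reduced to

$$
\begin{aligned}
&\mathrm{Prob.}\{y(k)\in Y,\ N_M(k)=m,\ \Lambda|_k=\lambda \mid \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\} \\
&\quad= \sum_{n=\#(\lambda)}^{\infty} \mathrm{Prob.}\{N_T=n \mid \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\}\,
\frac{(n-\#(\lambda|_{k'}))!}{(n-\#(\lambda))!} \\
&\qquad\cdot \mathrm{Prob.}\{y(k)\in Y,\ N_M(k)=m,\ \Lambda|_k=\lambda,\ \Omega_k=\omega \mid \Omega_{k'}=\omega',\ N_T=n,\ \Lambda|_{k'}=\lambda|_{k'},\ Z^{(k')}\}
\end{aligned}
\tag{A10}
$$

for any $\omega' \in \mathcal{W}(\lambda|_{k'},n)$ and any
$\omega \in \mathcal{W}(\lambda,n)$ which satisfies (A7).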

Substitute (A8) with $n_D(\cdot)$, $\delta$ and $a$ defined by (41), (43) and (44),
resp., and take the density on $(\mathcal{Y}_s)^m$ with respect to $\mu_s^m$.
Attach the prior

$$
P_{\Lambda}^{(k')}(\lambda|_{k'} \mid Z^{(k')}) = \mathrm{Prob.}\{\Lambda|_{k'}=\lambda|_{k'} \mid Z^{(k')}\}
$$

to it and divide it by the normalizing constant
$P_Z^{(k)}(Z^{(k)} \mid Z^{(k')})$. Then, we have (40). Q.E.D.

Appendix B: (Proof of Theorem 2)


We will prove the three parts [1] - [3] more or less simultaneously.
Part [1] and part [3] will be proved by recursion. First, let us assume
that (61) and (65) hold with $k$ replaced by $k'$. By (61') or (65'), we mean
(61) or (65) with $k'$ instead of $k$. For any $\lambda \in \mathcal{H}(k)$, any
$n \ge \#(\lambda)$, any $m \ge n_D(\lambda|k)$ and any
$(y_j)_{j=1}^{m} \in (\mathcal{Y}_s)^m$, it follows from (51'), (53) - (55), (57),
(61') and (62) - (64) that

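$$
\begin{aligned}
&\int_{\mathcal{X}^n} P_{N_{FA}}(m-n_D(\lambda|k) \mid k)
\Big(\prod_{\delta(i)=1} p_M(y_{a(\{i\})} \mid x_i,k)\Big)
\Big(\prod_{j\notin\mathrm{Image}(a)} p_{FA}(y_j \mid k)\Big) \\
&\qquad\times \Big(\prod_{i=1}^{n} p_D(x_i|k)^{\delta(i)}\,(1-p_D(x_i|k))^{(1-\delta(i))}\Big) \\
&\qquad\times \Big(\prod_{\tau'\in\lambda|_{k'}} p_{\tau'}^{(k')}(x_{\omega'(\tau')})\,\mu(dx_{\omega'(\tau')})\Big)
\Big(\prod_{i\notin\mathrm{Image}(\omega')} p_{\phi}^{(k')}(x_i)\,\mu(dx_i)\Big) \\
&= P_{N_{FA}}(m-n_D(\lambda|k) \mid k) \prod_{\substack{j=1 \\ j\notin\mathrm{Image}(a)}}^{m} p_{FA}(y_j \mid k) \\
&\qquad\cdot \Big[\prod_{\substack{\tau\in\lambda \\ \delta(\omega(\tau))=1}} \int_{\mathcal{X}} p_M(y_{a(\{\omega(\tau)\})} \mid x,k)\,p_D(x|k)\,p_{\tau|_{k'}}^{(k')}(x)\,\mu(dx)\Big] \\
&\qquad\cdot \Big[\prod_{\substack{\tau\in\lambda \\ \delta(\omega(\tau))=0}} \int_{\mathcal{X}} (1-p_D(x|k))\,p_{\tau|_{k'}}^{(k')}(x)\,\mu(dx)\Big] \\
&\qquad\cdot \Big[\int_{\mathcal{X}} (1-p_D(x|k))\,p_{\phi}^{(k')}(x)\,\mu(dx)\Big]^{(n-\#(\lambda))} \\
&= P_{N_{FA}}(m-n_D(\lambda|k) \mid k) \prod_{j\in J_{FA}(\lambda,m|k)} p_{FA}(y_j \mid k)
\prod_{\tau\in\lambda} L_k(Y(\tau,k),\tau) \cdot L_k(\theta,\phi)^{(n-\#(\lambda))}
\end{aligned}
\tag{B1}
$$

where $J^m = \{1,\ldots,m\}\times\{k\}$, $(\omega,\omega')$ in
$\mathcal{W}(\lambda,n)\times\mathcal{W}(\lambda|_{k'},n)$ is an arbitrary pair
satisfying (A7), and $n_D(\lambda|k)$, $J_{FA}(\lambda,m|k)$, $\delta$ and $a$ are
defined by (41), (68), (43) and (44), resp.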

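According to Theorem 1 (Equation (40)), it follows from (B1) and (65')
that

$$
\begin{aligned}
P_{\Lambda}^{(k)}(\lambda \mid Z^{(k)}) ={}&
\frac{P_{\Lambda}^{(k')}(\lambda|_{k'} \mid Z^{(k')})}{P_Z^{(k)}(Z^{(k)} \mid Z^{(k')})}\,
\frac{(m-n_D(\lambda|k))!}{m!}\, P_{N_{FA}}(m-n_D(\lambda|k) \mid k) \\
&\cdot \Big[\prod_{j\in J_{FA}(\lambda,m|k)} p_{FA}(y_j \mid k)\Big]
\Big[\prod_{\tau\in\lambda} L_k(Y(\tau,k),\tau)\Big] \\
&\cdot \exp(-\nu(k')) \cdot \exp(\nu(k')L_k(\theta,\phi)) \cdot (\nu(k'))^{(\#(\lambda)-\#(\lambda|_{k'}))}.
\end{aligned}
\tag{B2}
$$

(67) follows immediately from (B2), and hence, part [2] has been proved.

It follows from (B1), (46) and (65') that, for any $n \ge \#(\lambda)$,

$$
P_{N_T}^{(k)}(n \mid \lambda, Z^{(k)}) =
C_{N_T}^{(k)}(\lambda,Z^{(k)})^{-1} \cdot \exp(-\nu(k')) \cdot \nu(k')^{(\#(\lambda)-\#(\lambda|_{k'}))}\,
\frac{(\nu(k')L_k(\theta,\phi))^{(n-\#(\lambda))}}{(n-\#(\lambda))!}
\tag{B3}
$$

where $C_{N_T}^{(k)}(\lambda,Z^{(k)})$ is the normalizing constant. (B3) proves
part [1] since (B3) holds true even when $k$ is the minimal element in $K$, by
letting $\lambda|_{k'}=\phi$ and $\nu(k')=\nu_0$.

(70) is an obvious consequence of the definitions, (60), (62), (63) and
(64), of $p_{\tau}^{(k)}$, $L_k$, $g_k$ and $p_T^{(k)}$, resp. (61) follows from
(45), the assumptions, (51'), (53), (55) and (61'), and the definition of
$p_{\tau}^{(k)}$. Q.E.D.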


APPENDIX  B 


DISTRIBUTED  ESTIMATION  IN  NETWORKS 


C.Y.  Chong,  E.  Tse  and  S.  Mori 


ABSTRACT 


In this paper, we consider the distributed estimation problem for a
set of agents connected by an arbitrary communication network. The
agents communicate conditional probabilities of the random state over
the network. From these conditional probabilities, each agent then
tries to reconstruct the conditional probability that it would have
obtained had all the measurements been communicated instead of the
probabilities. It is found that in general the agents have to remember
some of the past conditional probabilities and may even have to request
additional information. A method for generating the fusion algorithm for
each agent based on the network structure is presented and applied to some
examples. The results are applicable to both dynamic and static states.

1.  INTRODUCTION


The  traditional  approach  to  estimation  has  been  centralized.  Even 
though  the  measurements  are  generated  by  a  large  number  of  sensors,  it 
is  usually  assumed  that  they  are  sent  to  a  central  site  where  processing 
is  carried  out  by  one  agent  (computer).  In  this  context  centralized 
estimation  theory  is  well  developed  and  has  found  applications  in  many 
real  world  problems. 

In recent years, there has been growing interest in distributed
estimation problems. In such problems (Figure 1), the sensor measurements
are not all transmitted to a central processor. Instead, a set of
local processors, which we call estimation agents, are present. The
agents are connected by a communication network. Each agent collects
the measurements from a subset of the sensors, performs some local
processing, and communicates the results to other agents.

The advantages of such a distributed estimation system are many.
It is more reliable (or less vulnerable) since there is no single
central site which is responsible for the proper functioning of the
system. Communication is cheaper since only the results of processing, and
not the raw data, are communicated. Furthermore, each distributed agent
has the use of the processed data locally and does not have to wait for
communication from the central processor. From a technological point of
view, such distributed systems are made possible by the availability of
cheap computing hardware. These advantages make distributed estimation
systems extremely attractive for many military and civilian applications.
One such application is the distributed sensor network [1], [2]
for tracking and surveillance.

[Figure 1: diagram with the labels "Communication Network" and "External Environment"]

Research in distributed estimation has progressed along several
directions. A team-theoretic approach has been taken by Barta [3] for
decentralized linear estimation and by Tenney and Sandell [4] for
distributed detection. Extensions of this work in detection have been made
by Teneketzis [5] and Ekchian and Tenney [6]. Another approach, based
on finding constrained decentralized filters, has been taken by Tacker
and Sanders [7]. The approach of fusion or combining of local estimates
to recover the globally optimal estimate has been used in [8] to [12].
The linear problem was considered by Speyer [8], Chong [9], Willsky et
al. [10] and Levy et al. [11], while Castanon and Teneketzis [12]
considered the nonlinear extension. In all of the above [8]-[11], the
system structure is hierarchical with no feedback communication or
coordination from the fusion agent. Similar problems of this type have also
been considered in the management science literature [13].

The network aspect of the distributed estimation problem has been
emphasized in [14] and [15] and discussed in [2]. Borkar and Varaiya
[14] presented results on the asymptotic agreement among agents for
estimation, while Tsitsiklis and Athans [15] considered asymptotic
agreement for more general decision problems. It has been demonstrated in
[2] via an example that agreement may not be desirable since the common
conclusion may be wrong.


In this paper, we elaborate on the results obtained in [2]. The
philosophy of fusion or combining of local conditional probabilities to
obtain the probability conditioned on all available information is again
used. However, arbitrary network structures are considered explicitly.
They may be hierarchical with or without feedback from the higher level
or fully distributed. The presentation is at a fairly elementary level
to simplify the notation but can be made more sophisticated if desired
by introducing sigma fields. The results may provide the theoretical
basis for the analysis and design of systems such as the distributed
sensor network.

The rest of this paper is organized as follows. In Section 2, we
present the model to be used for distributed estimation. Section 3
describes the distributed estimation problem. Section 4 describes the
basic results for static random states. A method for generating the
fusion formula for arbitrary networks is given. The fusion algorithms
for some examples are also described. Section 5 extends the basic
results to the case of dynamic random states. Section 6 is the
conclusion.

2.  MODEL  FOR  DISTRIBUTED  ESTIMATION


2.1  STATE  AND  OBSERVATION  MODELS 

We consider the estimation of a random process $x(t)$, $t \in T$, where
$T = [t_0, \infty)$ and $x(t) \in X$. The random process $x(\cdot)$ can be static,
deterministic or a general Markov process. We assume the statistics which
specify the random process completely are known.

Let $S$ be a finite set of sensors. At a given time $t$ in $T$, a sensor
$s$ generates an output or measurement $z$ in the measurement space $Z_s$. The
triple $(z,t,s)$ is then called a data set and $(t,s)$ is the data set
index. Let $Z$ be the set of all data sets and $K$ be the set of all data
set indices. If we assume that each sensor can produce only a finite
number of outputs in any finite time interval, the sets $Z$ and $K$ are at
most countable. Furthermore, for each $t \in T$, the restrictions

$$Z|_t = \{(z,t',s) \in Z \mid t' \le t\} \tag{2.1}$$

and

$$K|_t = \{(t',s) \in K \mid t' \le t\} \tag{2.2}$$

are both finite.
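
As a small illustration of the restrictions (2.1) and (2.2), the sketch below stores data sets as `(z, t, s)` tuples. The function names, the sample values, and the choice of an inclusive comparison (t' at or before t, matching (2.6)) are assumptions made for this example.

```python
def restrict_data(Z, t):
    """Z|t: the data sets (z, t', s) in Z with t' at or before t."""
    return {(z, tp, s) for (z, tp, s) in Z if tp <= t}

def restrict_index(K, t):
    """K|t: the data set indices (t', s) in K with t' at or before t."""
    return {(tp, s) for (tp, s) in K if tp <= t}

# Three hypothetical data sets from two sensors
Z = {(1.2, 0.0, "s1"), (0.7, 1.0, "s2"), (1.9, 2.0, "s1")}
K = {(t, s) for (_, t, s) in Z}

Z_at_1 = restrict_data(Z, 1.0)
K_at_1 = restrict_index(K, 1.0)
```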

We make two additional assumptions:

1.  The sensor origin and time of each data set are known, i.e., for
any data set $(z,t,s) \in Z$, $t$ and $s$ are known quantities.

2.  The measurements are all conditionally independent given the state
process, i.e., for any finite subset
$\{(z_1,t_1,s_1),\ldots,(z_k,t_k,s_k)\}$ of $Z$,

$$
\mathrm{Prob.}\Big(\bigcap_{i=1}^{k}\{z_i \in dz_i\} \,\Big|\, x(t_1),\ldots,x(t_k)\Big)
= \prod_{i=1}^{k} \mathrm{Prob.}(z_i \in dz_i \mid x(t_i)).
\tag{2.3}
$$

With the second assumption, the observation process can be characterized
completely by the transition probabilities (or probability densities)
from $X$ to $Z_s$.
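
The second assumption is what lets the measurement likelihood factor data set by data set. As a minimal sketch, assuming a static state that takes finitely many values (the function name `posterior`, the dictionary layout and the numbers are illustrative assumptions, not part of the report), the conditional probability given several data sets is proportional to the prior times the product of the per-sensor transition probabilities:

```python
def posterior(prior, likelihood, data_sets):
    """Posterior over a finite static state given conditionally independent data.

    prior:      dict mapping state x -> P(x)
    likelihood: dict mapping sensor s -> dict mapping (z, x) -> P(z | x)
    data_sets:  iterable of (z, t, s) triples; t is unused for a static state
    """
    weights = dict(prior)
    for z, _t, s in data_sets:
        for x in weights:
            # one factor of the product in (2.3) per data set
            weights[x] *= likelihood[s][(z, x)]
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}

# Hypothetical two-state example with two binary sensors
prior = {"a": 0.5, "b": 0.5}
likelihood = {
    "s1": {(1, "a"): 0.8, (0, "a"): 0.2, (1, "b"): 0.2, (0, "b"): 0.8},
    "s2": {(1, "a"): 0.6, (0, "a"): 0.4, (1, "b"): 0.3, (0, "b"): 0.7},
}
post = posterior(prior, likelihood, [(1, 0.0, "s1"), (1, 1.0, "s2")])
```

Because the factors multiply, the order in which the data sets are processed does not affect the result.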


2.2  DATA  BASES 

We are interested in estimation of the process by a network of
agents. At any time $t$, due to communication constraints, each agent may
not have access to all available data sets. In general, an agent will
have only a subset of the available $Z|_t$ at $t$, corresponding to only a
subset of $K|_t$. A data base $Z$ at time $t$ is a subset of $Z|_t$ and a data
index base $K$ at time $t$ is a subset of $K|_t$. According to this definition,
$Z|_t$ ($K|_t$) is the maximum data (index) base at $t$ and $\phi$ (the empty
set) is the minimum. Given any data base
$Z = \{(z_1,t_1,s_1),\ldots,(z_k,t_k,s_k)\}$, the corresponding data index base
$K = \{(t_1,s_1),\ldots,(t_k,s_k)\}$ is found by the operation

$$K = \mathrm{In}(Z) \tag{2.4}$$

where the definition of In is obvious, and the actual measurements
$(z_1,\ldots,z_k)$ are found by

$$(z_1,\ldots,z_k) = \mathrm{Mv}(Z). \tag{2.5}$$

When $Z = \phi$, $\mathrm{In}(\phi) = 0$ and $\mathrm{Mv}(\phi) = 0$, where 0 is a
symbol representing "no information".

For each data index base $K = \{(t_1,s_1),\ldots,(t_k,s_k)\}$ with
corresponding data base $Z = \{(z_1,t_1,s_1),\ldots,(z_k,t_k,s_k)\}$, we define
the conditional probability $P(\cdot \mid Z)$ to mean
$P(\cdot \mid \mathrm{Mv}(Z), K)$.

All the definitions above can be given more rigorously in terms of
sigma algebras. This will not be attempted in this paper so as to
simplify the development.
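
The operations In and Mv of (2.4) and (2.5) can be sketched as follows. This is an illustration only: data sets are assumed to be `(z, t, s)` tuples, `None` stands in for the "no information" symbol, and ordering the measurements by time and then sensor is an assumption made so that the output is reproducible.

```python
NO_INFORMATION = None  # stands in for the "no information" symbol

def In(Z):
    """Data index base of a data base Z: drop the measurement from each triple."""
    if not Z:
        return NO_INFORMATION
    return {(t, s) for (_, t, s) in Z}

def Mv(Z):
    """Measurement values of a data base Z, ordered by (time, sensor)."""
    if not Z:
        return NO_INFORMATION
    ordered = sorted(Z, key=lambda d: (d[1], d[2]))
    return tuple(z for (z, _, _) in ordered)

# Hypothetical data base with three data sets
Z = {(1.2, 0.0, "s1"), (0.7, 1.0, "s2"), (1.9, 2.0, "s1")}
K = In(Z)
values = Mv(Z)
```

With these two pieces, conditioning on a data base amounts to conditioning on the pair `(Mv(Z), In(Z))`, which mirrors the definition of the conditional probability given above.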


2.3  COMMUNICATION  MODEL 


We assume there is a finite set $N$ of estimation agents. Each agent
$n$ has its own set of sensors, i.e., a subset $S_n$ of $S$. Furthermore, the
sensor sets are disjoint for different agents, i.e., $S_n \cap S_{n'} = \phi$ for
$n \ne n'$. Each agent $n$ also receives information from other agents via
communication. Communication among agents is specified by the known
communication schedule $C$, which is a subset of $T \times N \times N$.
$(t,n_1,n_2) \in C$ means that agent $n_1$ transmits some messages to agent $n_2$
at time $t$. The exact form of the messages will be discussed later.

Just as for the set of data set indices, we assume the communication
frequency cannot be infinite, so that, for any $t \in T$, the communication
schedule up to $t$,

$$C|_t = \{(t',n_1,n_2) \in C \mid t' \le t\}, \tag{2.6}$$

is finite.

3.  DISTRIBUTED  ESTIMATION  PROBLEM


3.1  INFORMATION  GRAPH

The distributed estimation system $(N, S, C)$ thus consists of the
sensor set $S$ and the estimation agent set $N$ together with the communication
schedule. Four types of events affect the change of information
in the system. These events, the times when they occur and the nodes
(sensors or estimation agents) which are affected, are given below:

-  sensor observation: $K$,

-  reception of sensor data by an estimation agent:
   $\{(t,n) \in T \times N \mid (t,s) \in K,\ s \in S_n\}$,

-  transmission by an estimation agent:
   $\{(t,n) \in T \times N \mid (t,n,n') \in C\}$,

-  reception of transmission by an estimation agent:
   $\{(t,n) \in T \times N \mid (t,n',n) \in C\}$.

Consider a subset $I$ of $T \times (S \cup N)$ which is the union of all the
sets defined above. Define an anti-symmetric and transitive binary
relation (or partial ordering) $\prec$ on $I$ such that

i.  For each $(n,t,t') \in N \times T \times T$, $(t,n) \in I$, $(t',n) \in I$
and $t < t'$ implies that
$(t,n) \prec (t',n)$;

ii.  $(t,s) \in K$, $s \in S_n$ and $(t,n) \in I$ implies that
$(t,s) \prec (t,n)$;

iii.  $(t,n,n') \in C$ implies that
$(t,n) \prec (t,n')$.

This binary relation or partial order on $I$ thus satisfies all the
constraints associated with perfect communication as defined by $C$ as
well as perfect memory at each processing node. $(I, \prec)$ characterizes the
information flow in the system and is called the information graph. If
all the sensor measurements (data sets) can be communicated perfectly
through the communication network, the data base $Z(t,i)$ for each node
$(t,i)$ in the graph $(I, \prec)$ can be defined by beginning with the minimal
elements and following the rules shown below:

i.  If $(t,i)$ is a receiving node,

$$Z(t,i) = \bigcup \{Z(s,j) \mid (s,j) \to (t,i)\};$$

ii.  If $(t,i)$ is a transmitting node,

$$
Z(t,i) =
\begin{cases}
Z(s,j) & \text{if } (s,j) \to (t,i), \\
\{(z(k),k)\} & \text{if } (t,i) = k \in K, \\
\phi & \text{otherwise.}
\end{cases}
$$

In the above, $(s,j) \to (t,i)$ means that $(s,j)$ is an immediate predecessor
of $(t,i)$, and $(z(k),k) \in Z$ is the unique element whose second component
is $k$.

With this construction of the data base, we see that $(t,i) \prec (s,j)$
if and only if $Z(t,i) \subset Z(s,j)$. Similar remarks can be made for the
data index base $K(t,i)$.
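
The construction above can be sketched in code for the data index base $K(t,i)$. The following is an illustration under stated assumptions rather than the report's algorithm: sensor identifiers are assumed to be distinct from agent identifiers, the relation generated by rules i-iii is assumed to be acyclic, and the names `information_graph`, `data_index_base` and `owner` are hypothetical.

```python
def information_graph(K, owner, C):
    """Build the nodes and generating relations (rules i-iii) of the partial order.

    K:     set of data set indices (t, s)
    owner: dict mapping sensor s -> the agent n whose sensor set contains s
    C:     set of communication events (t, n1, n2), meaning n1 transmits to n2 at t
    Returns (nodes, preds), where preds[v] holds nodes that directly precede v.
    """
    nodes = set(K)
    nodes |= {(t, owner[s]) for (t, s) in K}
    nodes |= {(t, n1) for (t, n1, _) in C} | {(t, n2) for (t, _, n2) in C}
    preds = {v: set() for v in nodes}
    for (t, s) in K:                       # rule ii: sensor -> owning agent
        preds[(t, owner[s])].add((t, s))
    for (t, n1, n2) in C:                  # rule iii: sender -> receiver
        preds[(t, n2)].add((t, n1))
    agents = {i for (_, i) in nodes} - {s for (_, s) in K}
    for a in agents:                       # rule i: perfect memory over time
        times = sorted(t for (t, i) in nodes if i == a)
        for earlier, later in zip(times, times[1:]):
            preds[(later, a)].add((earlier, a))
    return nodes, preds

def data_index_base(node, preds, K):
    """K(t,i): indices of all sensor observations that precede or equal node."""
    seen, stack, base = set(), [node], set()
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        if v in K:
            base.add(v)
        stack.extend(preds[v])
    return base

# Hypothetical system: agents 2 and 3 own sensors "a" and "b" and report to agent 1
K = {(0, "a"), (0, "b"), (2, "a")}
owner = {"a": 2, "b": 3}
C = {(1, 2, 1), (1, 3, 1)}
nodes, preds = information_graph(K, owner, C)
fusion_base = data_index_base((1, 1), preds, K)
local_base = data_index_base((2, 2), preds, K)
```

In this example the fusion agent at time 1 holds both time-0 observations, while agent 2 at time 2 holds only the observations of its own sensor, which is the containment property stated above.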


Since  there  is  a  natural  direction  (along  increasing  time)  in  the 
graph,  the  arrowheads  on  the  edges  in  a  pictorial  representation  of  the 
graph  can  be  omitted.  We  would  also  omit  those  edges  which  are  due  to 
transitivity.  From  the  graph,  the  flow  of  information  in  the  system 
becomes  very  obvious.  A  node  (t,i)  is  a  parent  of  (s,j)  if  information 
flows  from  node  i  at  time  t  to  node  j  at  time  s.  Note  that  in  the 
information  graph,  the  receiving  nodes  correspond  to  the  events  when 
estimates  have  to  be  updated  with  the  arrival  of  new  information.  For 
many  applications,  it  is  sufficient  to  use  a  reduced  information  graph, 
which  is  obtained  by  considering  only  these  receiving  nodes. 

Several  examples  of  distributed  estimation  networks  and  their 
information  graphs  are  shown  below. 

Example 1: (Fusion Without Coordination)

Of the agents in $N$, agent 1 is a fusion agent and the rest are
local agents. The local agents transmit to the fusion agent after they
receive the data from the sensors and perform local processing. Figure
2 shows the structure of the system (for three agents) and the information
graphs. In this case

$$N = \{1, 2, 3\}, \qquad C = \bigcup_{i=1}^{\infty} \{(s_i,2,1),\ (s_i,3,1)\}$$

where $\{s_0 < s_1 < s_2 < \cdots\}$ are the communication times, and
$\{t_0 < t_1 < t_2 < \cdots\}$ are the sensor observation times.

Example  2:  (Fusion  With  Coordination) 

This is similar to Example 1 except that right after fusion, Agent
1 communicates with the local agents again. This structure is also
equivalent to a broadcast system where all agents communicate with each
other. For N = {1, 2, 3}, the communication schedule is given by

$$ C = \bigcup_{i=1}^{\infty} \bigcup_{n_1 \neq n_2} \{(s_i, n_1, n_2)\}, $$

where s_0 < s_1 < s_2 < ... . Figure 3 shows the structure of the system
and the information graphs.

Example 3: (Cyclic Communication)

This is the example considered in [2]. The agents are arranged in
a circle as in Figure 4. Each agent transmits only to its immediate
neighbor in a cyclic manner at the specified communication times. Figure
4 shows the example for N = {1, 2, 3} and

$$ C = \bigcup_{i=1}^{\infty} \{(s_i, 1, 3),\ (s_i, 3, 2),\ (s_i, 2, 1)\}, $$

with s_0 < s_1 < s_2 < ... .

[Figure 2. Fusion Without Coordination]

Example  4:  (Multipath  Pattern) 

The agents are arranged as in Figure 5. Agent 1 can only get
information from Agent 4 via Agents 2 and 3. For N = {1, 2, 3, 4} and

$$ C = \bigcup_{i=1}^{\infty} \{(s_i, 2, 1),\ (s_i, 3, 1),\ (s_i, 4, 2),\ (s_i, 4, 3)\}, $$

the information graphs are given in Figure 5.
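
A communication schedule of the kind used in Examples 1 to 4 can be turned into the data index sets held by each agent in a mechanical way. The following Python sketch is an illustration and not part of the original report: the function name `data_index_sets`, the integer time stamps, and the assumption that every agent has exactly one local sensor are all hypothetical. A sender is assumed to transmit the data base it held just before the communication time, so that simultaneous transmissions at the same time s_i do not relay each other.

```python
def data_index_sets(agents, observations, schedule):
    """Return {agent: set of (time, agent) observation indices held at the end}.

    observations: list of (t, agent) pairs, one per local sensor reading.
    schedule:     list of (s, sender, receiver) triples, as in the set C.
    """
    held = {a: set() for a in agents}
    times = sorted({t for t, _ in observations} | {s for s, _, _ in schedule})
    for now in times:
        # Snapshot: a sender transmits what it held just before time `now`.
        before = {a: set(k) for a, k in held.items()}
        for t, a in observations:
            if t == now:
                held[a].add((t, a))
        for s, src, dst in schedule:
            if s == now:
                held[dst] |= before[src]
    return held


# Multipath pattern of Example 4: 2 -> 1, 3 -> 1, 4 -> 2, 4 -> 3.
agents = [1, 2, 3, 4]
observations = [(t, a) for t in (0, 2) for a in agents]
links = [(2, 1), (3, 1), (4, 2), (4, 3)]
schedule = [(s, src, dst) for s in (1, 3) for src, dst in links]
held = data_index_sets(agents, observations, schedule)
```

In this run agent 1 ends up holding agent 4's observation from time 0, relayed by agents 2 and 3, but not the one from time 2, which has only reached agents 2 and 3.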

3.2  DISTRIBUTED  FUSION  PROBLEM 

The problem is to compute p(x(t)|Z(t,i)) for each node (t,i) ∈ I in
the graph (I, ≺). Since the conditional probabilities or any estimates
are updated only at the receiving nodes (extrapolation is carried out at
the other nodes), we need only consider the computations at the following
two types of nodes in the reduced information graph:

-  a  sensor  data  reception  node, 

-  a  communication  reception  node. 


At a sensor data reception node (t,i), computation of
p(x(t)|Z(t,i)) is straightforward: the standard Bayesian update formula
suffices. At a communication reception node, the objective is to
reconstruct p(x(t)|Z(t,i)) from the conditional probabilities
{p(x(t)|Z(s,j)) | (s,j) ≺ (t,i)}. This is the distributed fusion
problem: construction of the conditional probability given all the data
sets which would have been communicated through the network, using only
the conditional probabilities available at the predecessor nodes in the
information graph.


4.  STATIC  RESULTS 


In  this  section  we  develop  the  main  results  for  fusion  for  each 
agent  i,  assuming  the  random  process  is  static,  i.e.,  x(t)  =  x  for  all 
t.  Since  the  information  from  different  agents  may  overlap,  care  has  to 
be  taken  when  the  conditional  probabilities  from  different  agents  are 
combined.  In  particular,  any  redundant  information  has  to  be  identified 
so  that  it  is  not  used  more  than  once.  The  following  lemmas  provide  the 
mechanism  for  doing  this.  In  the  following  x  denotes  a  random  vector 
with  prior  probability  p(x)  and  Z  is  the  set  of  all  data  sets. 


4.1  BASIC  RESULTS 


Lemma  1 


Suppose Z_1 and Z_2 are data bases at two information nodes 1 and 2.
Then

$$ p(x \mid Z_1 \cup Z_2) = C\, \frac{p(x \mid Z_1)\, p(x \mid Z_2)}{p(x \mid Z_1 \cap Z_2)} \tag{4.1} $$

where C is a normalization constant.

Proof

In the following A \ B denotes the difference of two sets. By Bayes'
rule,

$$ p(x \mid Z_1 \cup Z_2) = \frac{p(Z_1 \cup Z_2 \mid x)\, p(x)}{p(Z_1 \cup Z_2)}. \tag{4.2} $$

Since Z_1 ∪ Z_2 can be written as

$$ Z_1 \cup Z_2 = (Z_1 \setminus Z_2) \cup (Z_2 \setminus Z_1) \cup (Z_1 \cap Z_2), \tag{4.3} $$

where the three disjoint data bases are conditionally independent given
x,

$$ p(x \mid Z_1 \cup Z_2) = \frac{p(Z_1 \setminus Z_2 \mid x)\, p(Z_2 \setminus Z_1 \mid x)\, p(Z_1 \cap Z_2 \mid x)\, p(x)}{p(Z_1 \cup Z_2)}. \tag{4.4} $$

But

$$ p(Z_1 \mid x) = p(Z_1 \setminus Z_2 \mid x)\, p(Z_1 \cap Z_2 \mid x), \qquad p(Z_2 \mid x) = p(Z_2 \setminus Z_1 \mid x)\, p(Z_1 \cap Z_2 \mid x). \tag{4.5} $$

Thus,

$$ p(x \mid Z_1 \cup Z_2) = \frac{p(Z_1 \mid x)\, p(Z_2 \mid x)\, p(x)}{p(Z_1 \cap Z_2 \mid x)\, p(Z_1 \cup Z_2)}, \tag{4.6} $$

which reduces to (4.1).

This lemma states that since p(x|Z_1) and p(x|Z_2) both include
information contained in the data base Z_1 ∩ Z_2, this common information
has to be removed so that it does not get double counted. Lemma 1 plays
a central role in distributed estimation theory similar to the usual
Bayes' rule in centralized estimation theory.
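
As a numerical illustration of Lemma 1, the following Python sketch checks (4.1) on a three-valued x. It is not from the original report: the helper names (`normalize`, `posterior`, `fuse`) and all numbers are made up for the example, and each measurement is represented only by its likelihood vector, assumed conditionally independent given x.

```python
def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def posterior(prior, likelihoods):
    """Bayes posterior over a finite x for conditionally independent data."""
    post = list(prior)
    for lik in likelihoods:
        post = [p * l for p, l in zip(post, lik)]
    return normalize(post)


def fuse(p1, p2, p_common):
    """Equation (4.1): multiply the two posteriors, divide out the common one."""
    return normalize([a * b / c for a, b, c in zip(p1, p2, p_common)])


prior = [0.5, 0.3, 0.2]
lik_a = [0.9, 0.2, 0.4]   # measurement held only by node 1
lik_b = [0.1, 0.7, 0.5]   # measurement held only by node 2
lik_c = [0.3, 0.3, 0.8]   # measurement held by both nodes

p1 = posterior(prior, [lik_a, lik_c])          # p(x | Z1)
p2 = posterior(prior, [lik_b, lik_c])          # p(x | Z2)
p_common = posterior(prior, [lik_c])           # p(x | Z1 ∩ Z2)
fused = fuse(p1, p2, p_common)
central = posterior(prior, [lik_a, lik_b, lik_c])
naive = normalize([a * b for a, b in zip(p1, p2)])  # ignores the overlap
```

Here `fused` agrees with the centralized posterior `central`, while `naive` counts the prior and the shared measurement twice and gives a different distribution.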

Lemma 2, which is a special case of Lemma 1, is also quite useful.

Lemma 2

Suppose Z_1 ∩ Z_2 = ∅. Then

$$ p(x \mid Z_1 \cup Z_2) = C\, \frac{p(x \mid Z_1)\, p(x \mid Z_2)}{p(x)}. \tag{4.7} $$


When the conditional probabilities from multiple agents are combined,
the fusion formula can be obtained by repeated applications of
Lemma 2. The following gives the results for three agents.

Lemma  3 


Suppose Z_1, Z_2 and Z_3 are data bases at the information nodes 1, 2
and 3. Then

$$ \begin{aligned}
p(x \mid Z_1 \cup Z_2 \cup Z_3)
&= C\, \frac{p(x \mid Z_1 \cup Z_2)\, p(x \mid Z_3)}{p(x \mid (Z_1 \cup Z_2) \cap Z_3)} \\
&= C\, \frac{p(x \mid Z_1)\, p(x \mid Z_2)\, p(x \mid Z_3)\, p(x \mid Z_1 \cap Z_2 \cap Z_3)}{p(x \mid Z_1 \cap Z_2)\, p(x \mid Z_2 \cap Z_3)\, p(x \mid Z_3 \cap Z_1)}.
\end{aligned} \tag{4.8} $$

This lemma again has a very intuitive explanation. The terms in
the denominator consist of pairwise redundant information to be removed.
When these are removed, all information which is common to
Z_1, Z_2, and Z_3 is also removed. This then has to be restored.
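
The add-and-restore pattern of (4.8) can be checked numerically. The sketch below is an illustration only, assuming a finite-valued x and conditionally independent measurements; the function `fuse_by_inclusion_exclusion`, the measurement labels, and the likelihood values are hypothetical. The intersection posteriors are computed here from raw likelihoods purely to verify the formula; in a network they would be the probabilities stored at, or relayed from, earlier nodes.

```python
from itertools import combinations


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def posterior(prior, lik, keys):
    """p(x | measurements named in `keys`) over a finite x."""
    post = list(prior)
    for key in sorted(keys):
        post = [p * l for p, l in zip(post, lik[key])]
    return normalize(post)


def fuse_by_inclusion_exclusion(prior, lik, data_bases):
    """Multiply posteriors of odd-order intersections, divide by even-order ones."""
    fused = [1.0] * len(prior)
    for k in range(1, len(data_bases) + 1):
        exponent = 1 if k % 2 else -1
        for group in combinations(data_bases, k):
            common = set.intersection(*group)
            p = posterior(prior, lik, common)
            fused = [f * q ** exponent for f, q in zip(fused, p)]
    return normalize(fused)


prior = [0.5, 0.3, 0.2]
lik = {
    "a": [0.9, 0.2, 0.4], "b": [0.1, 0.7, 0.5], "c": [0.3, 0.3, 0.8],
    "d": [0.6, 0.5, 0.2], "e": [0.4, 0.9, 0.3], "f": [0.7, 0.1, 0.6],
}
z1, z2, z3 = {"a", "d", "f"}, {"b", "d", "e", "f"}, {"c", "e", "f"}
fused = fuse_by_inclusion_exclusion(prior, lik, [z1, z2, z3])
central = posterior(prior, lik, z1 | z2 | z3)
```

Measurement "f" is held by all three nodes: it enters three times through the numerator, is removed three times by the pairwise terms, and is restored once by the triple intersection, so `fused` matches `central`.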

If all the random elements involved are Gaussian, the lemmas above
can be simplified so that only the conditional means and covariances are
involved. Suppose x is Gaussian with mean m and covariance P(0). Let
x(Y) and P(Y) be the mean and covariance corresponding to the conditional
density p(x|Y). Then Lemma 1 becomes

Lemma 1A

$$ P(Z_1 \cup Z_2)^{-1} = P(Z_1)^{-1} + P(Z_2)^{-1} - P(Z_1 \cap Z_2)^{-1} \tag{4.9} $$

and

$$ P(Z_1 \cup Z_2)^{-1}\, x(Z_1 \cup Z_2) = P(Z_1)^{-1} x(Z_1) + P(Z_2)^{-1} x(Z_2) - P(Z_1 \cap Z_2)^{-1} x(Z_1 \cap Z_2). \tag{4.10} $$

Lemma  2  and  Lemma  3  can  be  simplified  in  a  similar  way.  Lemma  1A 
is  identical  to  that  used  in  [9]  for  deriving  the  optimal  algorithms  for 
combining  estimates  of  linear  Gaussian  systems. 
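
A scalar instance of Lemma 1A can be checked directly. The following Python sketch is illustrative and not taken from the report: it assumes a scalar Gaussian x and direct measurements z = x + v with independent Gaussian noise of variance r, and the function names and numbers are hypothetical.

```python
def gaussian_posterior(m, p0, measurements):
    """Posterior (mean, variance) of x ~ N(m, p0) given pairs (z, r), z = x + v."""
    info = 1.0 / p0            # inverse variance
    info_mean = m / p0         # inverse variance times mean
    for z, r in measurements:
        info += 1.0 / r
        info_mean += z / r
    return info_mean / info, 1.0 / info


def fuse_gaussian(est1, est2, est_common):
    """Combine two (mean, variance) estimates, removing the common part."""
    (x1, p1), (x2, p2), (xc, pc) = est1, est2, est_common
    info = 1.0 / p1 + 1.0 / p2 - 1.0 / pc        # equation (4.9)
    info_mean = x1 / p1 + x2 / p2 - xc / pc      # equation (4.10)
    return info_mean / info, 1.0 / info


m, p0 = 0.0, 4.0
za, zb, zc = (1.2, 1.0), (0.7, 2.0), (1.5, 0.5)   # (value, noise variance)

est1 = gaussian_posterior(m, p0, [za, zc])        # node 1 holds za and zc
est2 = gaussian_posterior(m, p0, [zb, zc])        # node 2 holds zb and zc
est_common = gaussian_posterior(m, p0, [zc])      # shared data only
fused = fuse_gaussian(est1, est2, est_common)
central = gaussian_posterior(m, p0, [za, zb, zc])
```

The subtraction in the information (inverse covariance) domain plays the role of the division in (4.1), and `fused` reproduces the centralized mean and variance.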

We now state the static fusion problem for each agent assuming that
x(t) = x for all t. The problem is stated for the case when messages
are received from only one agent, but the extension to multiple agents
is obvious.


Static  Fusion  Problem 

Suppose agent i receives a message from agent j at time s in the
form of a conditional probability p(x|Z(s,j)). Let (t,i) be the immediate
predecessor to (s,i) for agent i. Agent i's data base then changes
from Z(t,i) to

$$ Z(s,i) = Z(t,i) \cup Z(r,j), \tag{4.11} $$

where (r,j) is the immediate predecessor to (s,j) for agent j. The
objective is to find p(x|Z(s,i)) in terms of p(x|Z(t,i)), p(x|Z(r,j))
and possibly other conditional probabilities defined on the information
graph, i.e., {p(x|Z(t',i')) | (t',i') ≺ (s,i)}.


We do not specify a priori which conditional probabilities are
involved except they have to be conditional on some data base Z defined
on the information graph and that they should be available through communication.
The following recursive algorithm allows us to find the set
of needed conditional probabilities and how they should be combined.


The algorithm consists of repeated applications of the following steps.

Step 1

If Z(t,i) ∩ Z(r,j) is the data base for some node in the information
graph, i.e., Z(t,i) ∩ Z(r,j) = Z(q,k) for some (q,k) in I, or if it is
empty, then the algorithm terminates. If not, Step 2 is used. In terms
of the information graph representation introduced in Section 3, this
step is particularly simple. We start from two information nodes (t,i)
and (r,j) whose conditional probabilities are to be combined. Z(t,i) ∩
Z(r,j) corresponds to the information of all those nodes which are
parents of both (t,i) and (r,j).

Step  2 


Let {(t_1,k_1), (t_2,k_2), ...} be the set of common predecessors of
(t,i) and (r,j) in the information graph. Then

$$ Z(t,i) \cap Z(r,j) = Z(t_1,k_1) \cup Z(t_2,k_2) \cup \cdots \tag{4.13} $$

Step 1 can now be repeated with the help of Lemma 1 (and its multiple
agent version) to express p(x|Z(t,i) ∩ Z(r,j)) in terms of the conditional
probabilities p(x|Z(t_i,k_i)), i = 1, 2, ..., and
p(x|Z(t_i,k_i) ∩ Z(t_j,k_j)), i = 1, 2, ..., j = 1, 2, ..., etc. The algorithm
terminates when all the conditional probabilities are defined on
nodes in the information graph or coincide with the a priori distributions.
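
The recursion in Steps 1 and 2 can be sketched in code when each node's data base is available as an explicit set of data indices. This is a simplification made for illustration, not the report's procedure as such: the report works on the information graph, whereas the hypothetical function `express` below works on index sets and uses the maximal node data bases contained in the target in place of the common predecessors. It returns, for a target data base, the exponent with which each node's conditional probability enters the product (+1 for fusion, -1 for removal of redundant information).

```python
from itertools import combinations


def express(target, nodes, prior="prior"):
    """Return {name: exponent} with p(x|target) proportional to the product
    of p(x|Z_name) ** exponent; `nodes` maps node names to frozensets."""
    target = frozenset(target)
    if not target:
        return {prior: 1}                      # empty data base: a priori term
    for name, db in nodes.items():
        if db == target:
            return {name: 1}                   # Step 1: target is a node's data base
    # Step 2: cover the target by the maximal node data bases inside it.
    inside = [db for db in set(nodes.values()) if db < target]
    maximal = [db for db in inside if not any(db < other for other in inside)]
    if frozenset().union(*maximal) != target:
        raise ValueError("target is not a union of node data bases")
    out = {}
    for k in range(1, len(maximal) + 1):
        sign = 1 if k % 2 else -1              # inclusion-exclusion sign
        for group in combinations(maximal, k):
            common = frozenset.intersection(*group)
            for name, e in express(common, nodes, prior).items():
                out[name] = out.get(name, 0) + sign * e
    return {name: e for name, e in out.items() if e != 0}


# Fusion agent 1 receiving from local agent 2 (labels are illustrative).
nodes = {
    "(s-1,1)": frozenset({"z1_old", "z2_old"}),
    "(t-1,2)": frozenset({"z2_old"}),
    "(t,2)": frozenset({"z2_old", "z2_new"}),
}
formula = express(nodes["(s-1,1)"] | nodes["(t,2)"], nodes)

# Two nodes with disjoint data bases: the prior is divided out once.
disjoint = express({"u", "v"}, {"A": frozenset({"u"}), "B": frozenset({"v"})})
```

For the first case the result says to multiply p(x|Z(s-1,1)) and p(x|Z(t,2)) and divide by p(x|Z(t-1,2)); for the disjoint case the a priori distribution appears with exponent -1, as in Lemma 2.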


By applying this algorithm, p(x|Z(t,i) ∪ Z(r,j)) can be expressed
in terms of products and ratios of conditional probabilities defined on
information nodes. Each product corresponds to the fusion or combining
of information whereas each division corresponds to the removal of
redundant information. Note that in general it is not sufficient to use
only the conditional probabilities p(x|Z(t,i)) and p(x|Z(r,j)) unless
Z(t,i) and Z(r,j) happen to be disjoint or there is a node (s,k) such
that Z(s,k) = Z(t,i) ∩ Z(r,j). Additional conditional probabilities
from the past are also needed so that the redundant information in
Z(t,i) and Z(r,j) can be identified and removed. Two cases are possible.

Case 1: The additional conditional probabilities are all available to
agent i, i.e., they are either generated locally from measurements, or
they are received from other agents.

Case 2: The additional conditional probabilities may not be available
to agent i. In this case, additional communication may be added. However,
from the algorithm, it can be seen that existing communication
paths are available to pass along these conditional probabilities.

We have thus solved the fusion problem for each agent in a distributed
estimation network. This algorithm also provides us with the set
of conditional probabilities which needs to be stored at each agent plus
the additional set of conditional probabilities which needs to be communicated.


When the random elements involved are all Gaussian, the sufficient
statistics for the conditional probabilities become the conditional
means and covariances. With the help of Lemma 1A, we can again apply
the algorithm. Instead of multiplication and division of probabilities,
however, we now have operations involving conditional means and covariances.
The results are straightforward and will not be presented here.

4.2  STATIC  EXAMPLES 

In the following we assume the measurements are made at times {...,
t-1, t, t+1, ...} and messages are received at times {..., s-1, s, s+1, ...}
with s-1 < t < s.

Example 1: (Fusion Without Coordination)

Consider the fusion time s. Let t be the observation time immediately
before s. With the information graph it is easy to see that

$$ Z(s-1,1) \cap Z(t,2) = Z(t-1,2). \tag{4.14} $$

Thus

$$ p(x \mid Z(s-1,1) \cup Z(t,2)) = C\, \frac{p(x \mid Z(t,2))}{p(x \mid Z(t-1,2))}\, p(x \mid Z(s-1,1)). $$

By a recursive argument, we can show that

$$ p(x \mid Z(s,1)) = C \prod_{i} \frac{p(x \mid Z(t,i))}{p(x \mid Z(t-1,i))}\, p(x \mid Z(s-1,1)). \tag{4.15} $$

Each term in the product contains the new information contained in
the new measurement z(t,i) of agent i. All other information is already
known to agent 1. The fusion problems of the other agents are similar.

Example 2: (Fusion with Coordination or Broadcast System)

From the information graph, we see that for all j,

$$ \bigcap_{i} Z(t,i) = Z(s-1,j). \tag{4.16} $$

Thus, the algorithm gives for each j,

$$ p(x \mid Z(s,j)) = C \prod_{i} \frac{p(x \mid Z(t,i))}{p(x \mid Z(s-1,i))}\, p(x \mid Z(s-1,j)). \tag{4.17} $$

Each term in the product is the new information contained in measurement
z(t,i).

Example 3: (Cyclic Communication)

$$ Z(t,1) \cap Z(t,2) = Z(t-2,1) \cup Z(t-1,2), \tag{4.18} $$

and

$$ \begin{aligned}
Z(t-2,1) \cap Z(t-1,2) &= Z(s-3,1) \\
&\phantom{=}\ \ Z(t-3,1) \cup Z(t-3,2).
\end{aligned} \tag{4.19} $$


Thus,  in  addition  to  the  most  current  conditional  probability 
p(x|Z(t,l)),  agent  1  has  to  remember  three  other  probabilities.  Note 
that  p(x |Z(t-l , 2) )  is  available  to  agent  1  from  earlier  communications. 
This  indicates  that  in  a  distributed  estimation  network,  knowing  the 
most  recent  estimate  is  frequently  not  sufficient  if  one  wants  to 
recover  the  globally  optimal  estimate.  In  fact,  it  has  been  shown  via 
simulation  in  [2]  that  if  a  suboptimal  rule  of  combining  estimates  is 
used,  such  as 

$$ p(x \mid Z(t,1) \cup Z(t,2)) \approx C\, p(x \mid Z(t,1))\, p(x \mid Z(t,2)) \tag{4.21} $$


for  agent  1  and  similar  rules  for  agents  2  and  3,  the  agents  agree 
asymptotically.  This  is  consistent  with  the  results  on  asymptotic 
agreement  in  distributed  estimation  as  given  in  [14].  However,  the 
agents  can  converge  to  the  wrong  estimate  as  demonstrated  in  [2].  Thus, 
although  optimal  fusion  algorithms  are  in  general  more  complicated, 
requiring more memory and more computation, they are nonetheless necessary
if good performance is needed. A suboptimal algorithm has also
been  tested  in  [2]  and  shown  to  have  some  nice  properties. 

Example 4: (Multipath Pattern)

The fusion problems of agents 2 and 3 are straightforward. For
agent 1, repeated use of the algorithm (with the help of the information
graph in Figure 5) gives

$$ p(x \mid Z(s,1)) = C\, \frac{p(x \mid Z(t,2))\, p(x \mid Z(t,3))\, p(x \mid Z(t-2,4))}{p(x \mid Z(t-1,2))\, p(x \mid Z(t-1,3))\, p(x \mid Z(t-1,4))}\, p(x \mid Z(t,1)). \tag{4.22} $$


In  addition  to  the  conditional  probabilities  from  agents  2  and  3, 
conditional  probabilities  by  agent  4  are  also  needed.  These  would  have 
to  be  relayed  by  agents  2  or  3. 

In the above examples, general fusion formulas are given. If the
random vectors are all Gaussian, these formulas can be simplified using
Lemma 1A.


5.  DYNAMIC  RESULTS 


Assume  now  that  x(.)  is  a  Markov  process.  The  fusion  problem  for 
each  agent  will  now  be  considered.  Since  the  data  sets  are  no  longer 
conditionally  independent  given  x(t),  one  immediate  question  is  the 
choice  of  an  appropriate  "state"  y  whose  conditional  probabilities  would 
be computed, transmitted and combined by the various agents. Let T(t,i)
be

$$ T(t,i) = \{t' \in T \mid (t',i') \in K(t,i)\}, \tag{5.1} $$

and

$$ y = (x(t'))_{t' \in T(t,i)} \tag{5.2} $$

for each information node (t,i) where fusion is to be performed. Then
the problem is effectively reduced to a static problem of the type considered
in Section 4. Using the independence assumptions on the measurements
in the data base given y, the algorithm in Section 4 can be
applied. However, this means that the conditional probability of a high
dimensional random vector y would have to be stored and transmitted.
From an implementation point of view, this may not be feasible.


For deterministic random processes, which can be characterized by
the state at one given time, an obvious choice is to estimate x(t_0)
where t_0 is the minimum in the set T. Again, due to the Markov
property, the conditional independence assumption is satisfied and the
algorithm can be used. However, if there are substantial changes in the
process, x(t_0) may not be the state of interest. In this section, we
characterize the more current states whose conditional probabilities
ought to be transmitted and combined.


The  following  generalization  of  Lemma  1  is  needed. 


Lemma  4 


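Consider a random vector y and data bases Z_1 and Z_2 defined on the
information graph. Suppose

$$ p(Z_1 \setminus Z_2,\ Z_2 \setminus Z_1 \mid Z_1 \cap Z_2,\ y) = p(Z_1 \setminus Z_2 \mid y,\ Z_1 \cap Z_2)\, p(Z_2 \setminus Z_1 \mid y,\ Z_1 \cap Z_2). \tag{5.3} $$

Then

$$ p(y \mid Z_1 \cup Z_2) = C\, \frac{p(y \mid Z_1)\, p(y \mid Z_2)}{p(y \mid Z_1 \cap Z_2)} \tag{5.4} $$

where C is a normalization constant.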


Proof 


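By Bayes' rule and (5.3), we have

$$ \begin{aligned}
p(Z_1 \cup Z_2,\ y)
&= p(Z_1 \setminus Z_2,\ Z_2 \setminus Z_1,\ Z_1 \cap Z_2,\ y) \\
&= p(Z_1 \setminus Z_2,\ Z_2 \setminus Z_1 \mid Z_1 \cap Z_2,\ y)\, p(Z_1 \cap Z_2,\ y) \\
&= p(Z_1 \setminus Z_2 \mid Z_1 \cap Z_2,\ y)\, p(Z_2 \setminus Z_1 \mid Z_1 \cap Z_2,\ y)\, p(Z_1 \cap Z_2,\ y) \\
&= \frac{p(Z_1 \setminus Z_2,\ Z_1 \cap Z_2,\ y)\, p(Z_2 \setminus Z_1,\ Z_1 \cap Z_2,\ y)}{p(Z_1 \cap Z_2,\ y)} \\
&= \frac{p(Z_1,\ y)\, p(Z_2,\ y)}{p(Z_1 \cap Z_2,\ y)}.
\end{aligned} \tag{5.5} $$

The lemma then follows naturally.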

Lemma 4 states that even though the individual measurements in Z do
not satisfy the conditional independence assumptions given y, Equation
(5.4) (which is the same as (4.1)) is still valid provided the private
data bases Z_1 \ Z_2 and Z_2 \ Z_1 are conditionally independent given the
state y and the common information Z_1 ∩ Z_2.

We can now state the following theorem which characterizes the
state vector which should be estimated for deterministic dynamic random
processes.

Theorem 

Consider the fusion problem for the information node (t,i) assuming
a deterministic random process x. Suppose the algorithm of Section 4 yields
the fusion formula

$$ p(x \mid Z(t,i)) = F\bigl(p(x \mid Z(t',i'));\ (t',i') \in L(t,i)\bigr) \tag{5.6} $$

where F is a function consisting of products and ratios of
p(x|Z(t',i'))'s in the set L(t,i), and L(t,i) is a subset of the predecessor
information nodes of (t,i). Then for a deterministic random process
x(.), equation (5.6) holds with x replaced by x(t*), where

$$ t^{*} = \min\{t' \mid (t',s') \in L(t,i) \setminus \{(t'',s'')\}\} \tag{5.7} $$

and (t'',s'') is the minimal element in L(t,i).

The proof is straightforward and is based on the algorithm of Section
4 and Lemma 4.

This theorem states that for random processes, in general the filtered
estimate represented by the conditional probabilities
p(x(t)|Z(t,i)) may not be adequate for optimal fusion at time t. Sometimes
the agents need to have the conditional probabilities of the
states at some earlier times. Thus, smoothed estimates are frequently
needed. From this, the estimates of the current states can be obtained
easily by extrapolation, e.g.,

$$ p(x(t) \mid Z(t,i)) = \int p(x(t) \mid x(s))\, p(x(s) \mid Z(t,i))\, dx(s). \tag{5.8} $$
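
For a finite state space, the integral in (5.8) becomes a sum over the earlier state. The following Python sketch illustrates this under that assumption; the function name `extrapolate`, the three-state chain, and all numbers are hypothetical and not taken from the report.

```python
def extrapolate(p_s, transition):
    """Discrete form of (5.8): p_t[j] = sum_i transition[i][j] * p_s[i],
    where transition[i][j] stands for p(x(t) = j | x(s) = i)."""
    n_next = len(transition[0])
    return [
        sum(p_s[i] * transition[i][j] for i in range(len(p_s)))
        for j in range(n_next)
    ]


# Smoothed estimate of the earlier state x(s) given the data base Z(t,i).
p_s = [0.7, 0.2, 0.1]
# Each row is a conditional distribution of x(t) given one value of x(s).
transition = [
    [0.9, 0.1, 0.0],
    [0.0, 0.8, 0.2],
    [0.1, 0.0, 0.9],
]
p_t = extrapolate(p_s, transition)
```

For a deterministic process each row of `transition` would contain a single 1, so extrapolation only relabels the states of the smoothed estimate.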

When  this  theorem  is  applied  to  the  examples  in  Section  4,  we 
obtain  the  following  results. 

Example 1: (Fusion without Coordination)


In the fusion equation (4.15), the state to be estimated is x(t).
This is consistent with the results in [8]-[12].


As a variation of this, consider a periodic fusion situation where
the local agents acquire measurements at a higher rate than they communicate
with the fusion agent (Figure 6). Specifically, let the new
fusion time set for agent 1 be {..., s-M, s, s+M, ...}, where M is the number
of time units between communications.


Thus, the state of interest is now x(t-M-1), and each term in the
product contains the new information of agent i about this state.

Example 2: (Fusion with Coordination)

In equation (4.17), the state is x(t).

Example 3: (Cyclic Communication)

In equation (4.18), the state is x(t-2). Thus, extrapolation is
needed if the estimate of x(t) is needed.

Example 4: (Multipath Pattern)


6. CONCLUSION


We have presented a formalism for the distributed estimation problem.
Using this formalism, the optimal fusion algorithm for each agent
in the network has been developed for arbitrary network structures.
Results for both static and deterministic dynamic random states have
been described, and illustrated with examples.

The results have been presented for very general state and observation
models. Special cases such as linear models with Gaussian noises
can be considered. An interesting special case for distributed multitarget
tracking and classification has also been investigated and
briefly reported in [2]. The details will appear elsewhere.


APPENDIX C


DISTRIBUTED MULTITARGET TRACKING AND CLASSIFICATION -
A  BAYESIAN  APPROACH 

C.Y.  Chong  and  S.  Mori 

ABSTRACT 

The tracking and classification of multiple targets by a network of
processing agents (processors) is considered. A Bayesian approach is
adopted as the theoretical basis. Each agent processes the local sensor
data to obtain the local information state consisting of the local
hypotheses, tracks and the relevant probabilities and state distributions.
These are communicated to the other agents by means of the communication
network. From these, each agent tries to construct the global
information state conditioned on the data which would be available
if they were communicated through the network. Results for both static
and dynamic target models are presented assuming broadcast type communication.


1. INTRODUCTION


The tracking and classification of multiple targets is very important
for many civilian and military applications. It is also interesting
from a theoretical standpoint since it is essentially different from
classical estimation problems in that the origins of the measurements
are uncertain. Many algorithms for multi-target tracking have been proposed.
Surveys of the area can be found in the paper by Bar-Shalom [1]
and the Naval Ocean Surveillance Correlation Handbooks [2], [3]. The
paper by Reid [4] also contains a good survey of the then existing
methods. Recently, a general theory for the tracking and classification
of multiple targets based on a Bayesian approach has been proposed in
[5] and [6]. Much of the work, however, assumes a centralized processing
architecture in that the sensor measurements are transmitted to a
single processor where they are processed.

In many applications, however, the sensor measurements are not all
transmitted to a central processor. Instead, a set of local processors
is present and each processor handles the measurements from a subset of
the sensors. Each processor does some local tracking and communicates
the results to other processors where the incoming information is combined
or fused with the local information. Such architectures are
present whenever tracking is carried out by multiple processors that communicate.
The distributed sensor network is an example of such systems
[7], [8].

In recent years, there has been growing interest in distributed
estimation problems [9]-[21]. Most of the work deals with the estimation
of a random process or hypothesis testing assuming the origins of
measurements are known. Exceptions can be found in [22], [23], which
present some ad hoc schemes for distributed multitarget tracking, and
[8], which briefly outlines some theoretical results. Some specific
results have also been considered in [24], [25], which consider the problem
of correlation of tracks from multiple nodes. This work is, however,
quite ad hoc and not related to any theory of multitarget tracking.

In this paper we present a theory for distributed multitarget
tracking and classification assuming the independent and identically
distributed target models of [5] or [6]. Each processor forms the
data-association hypotheses, tracks and the various associated probabilities
and communicates these to the other processors through the network.
Upon receiving these, each node tries to reconstruct the global
hypotheses, tracks, probabilities of hypotheses and the state distributions
of the tracks as if the sensor measurements were available through
the network. The theoretical framework introduced in [21] for distributed
estimation and the theory of multitarget tracking of [5], [6] are
used to derive the fusion algorithms for each processor in the network.
Although the philosophy can be used for general communication structures,
the special case of broadcast communication has been used to
illustrate the algorithm.


The structure of this paper is as follows. In Section 2, we present
the basic target and sensor models used. The information structure of
the system, which depends on the communication network, is also introduced.
Section 3 deals with the notions of tracks and hypotheses and
defines the distributed multitarget tracking and classification problem.
The main results for tracking of stationary targets assuming broadcast
type communication are described in Section 4. In particular we discuss
the construction of the global hypotheses from the local hypotheses and
the hypothesis evaluation problem. The extension of these results to
dynamic target models is given in Section 5. Section 6 contains the
conclusion.


2.  MODELS 


The  main  difference  between  distributed  multitarget  tracking  and 
centralized  multitarget  tracking  is  in  the  presence  of  multiple  tracking 
agents.  Thus  the  target  and  sensor  models  would  be  identical  to  the 
centralized  case  [5],  [6],  but  additional  constraints  or  models  would 
describe  the  information  available  to  each  node,  i.e.,  the  information 
structure.  In  the  following  we  shall  discuss  the  three  models 
separately. 


2.1  TARGET  MODEL 


A general target model used in multitarget tracking and classification
has been described in [5], [6]. Although a theory of distributed
multitarget tracking and classification can be developed using general
models, our emphasis in this paper is on a special but widely applicable
target model, namely, independent and identically distributed (i.i.d.)
target models. Specifically, the target system state at any time t is

$$ \bigl( (x_i(t))_{i=1}^{N_T},\ N_T \bigr) \tag{2.1} $$

where N_T is the constant but unknown number of targets and x_i(t) is the
state of the i-th target at time t. The a priori distribution of N_T is
Poisson with mean ν_0. Given N_T, (x_i(t)), i = 1, ..., N_T, is a system of
independent and identically distributed Markov processes on the common
target state space X. Each x_i(.) has the same initial distribution/density
q_0 and the transition probability density f, i.e.,

$$ \text{Prob.}\{x_i(t_0) \in dx\} = q_0(x)\, \mu(dx) \tag{2.2} $$

$$ \text{Prob.}\{x_i(t + \Delta t) \in dx \mid x_i(t) = x'\} = f_{\Delta t}(x \mid x')\, \mu(dx) \tag{2.3} $$

where μ is the hybrid measure on X which is a hybrid set. A hybrid set
is the direct product of a subset of Euclidean space and a finite set
and a hybrid measure is the direct product measure of the Lebesgue and
counting (discrete) measures. Thus each x_i(t) consists of a continuous
part corresponding to position, velocity, etc., and a discrete part
corresponding to target type, maneuvering mode, etc.
2.2  SENSOR  MODEL 

We assume there is a system of sensors called S. For each sensor s
in S, the sensor output space Z_s is

$$ Z_s = \bigcup_{m=0}^{\infty} (Y_s)^m \times \{m\} \tag{2.4} $$

where Y_s is the measurement space for sensor s. Each sensor output
((y_i), i = 1, ..., m; m) means that m measurements are generated and the
measurement values are y_1, ..., y_m. In general, Y_s is a hybrid set, where
the continuous part is used for analog information such as position and
velocity and the discrete part is used for feature-type information such as
size/cross-section, classification, etc.


A random element

$$ \bigl( (y_j)_{j=1}^{N_M},\ N_M,\ t,\ s \bigr) \in \bigcup_{s \in S} Z_s \times T \times \{s\} \tag{2.5} $$

is called a data set. It represents the event that measurements
y_1, ..., y_{N_M} are generated by sensor s at time t. Given a target system
state ((x_i(t)), i = 1, ..., N_T; N_T), the data set is generated via the
following four steps:

a. Detection

Let I_T = {1, ..., N_T} be the set of target indices. Then the set of
targets detected by a sensor s at time t is a random subset I_D(t,s) of
I_T which can be characterized by its indicator function F_D(t,s).
F_D(t,s), which is a random binary function with domain I_T, is called the
detection function. F_D(t,s)(i) = 1 means that target i is detected by
sensor s at time t and 0 means that it is not detected. We assume that
every F_D(t,s)(i) depends only on target i's state x_i(t) and that there
exists a common detection probability function p_D(x|t,s) such that

$$ \text{Prob.}\{F_D(t,s)(i) = 1 \mid x_i(t),\ N_T\} = p_D(x_i(t) \mid t, s). $$

b. Number of False Alarms Generation


We assume that the order in which the measurements arrive in a data
set does not contain any information about the targets. If not, the
data set should be further subdivided until this assumption holds. Let
A(t,s) be the random assignment function which assigns the detected targets
to the measurements. Since the order of the measurements does not
contain any useful information, the probability of A(t,s) taking on any
possible assignment α is uniform.

d. Measurement Value Generation

The value of a false alarm is an independent random variable (vector)
and has a common probability distribution/density $p_{FA}(y_j|t,s)$. For
any detected target $x_i$, given an assignment $A(t,s) = \alpha$, the
corresponding measurement value $y_{\alpha(i)}$ is an independent random
vector with a transition probability density

$$p_M(y_{\alpha(i)} \mid x_i). \qquad (2.8)$$


In  the  above  description  of  the  general  sensor  model,  we  have  made 
the  usual  assumptions  that  there  are  no  merged  measurements  or  split 
measurements . 
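
The four generation steps can be illustrated by a short simulation. This is a minimal sketch under assumptions that are not part of the model above: the number of false alarms is taken to be Poisson distributed (the text does not fix this distribution here), and the names `generate_data_set`, `sample_poisson`, `p_detect`, `sample_target_meas` and `sample_false_alarm` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate by Knuth's multiplication method."""
    limit, count, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def generate_data_set(states, p_detect, sample_target_meas,
                      sample_false_alarm, mean_false_alarms, rng):
    """Return (measurements, assignment) for one sensor at one time."""
    # a. Detection: each target is detected independently with p_D(x_i).
    detected = [i for i, x in enumerate(states) if rng.random() < p_detect(x)]
    # b. Number of false alarms (ASSUMPTION: Poisson distributed).
    n_false = sample_poisson(mean_false_alarms, rng)
    # c. Assignment: a uniformly random injection of the detected targets
    #    into the measurement positions.
    positions = list(range(len(detected) + n_false))
    rng.shuffle(positions)
    assignment = dict(zip(detected, positions))
    # d. Measurement values: target-originated or false alarm.
    origin = {j: i for i, j in assignment.items()}
    measurements = [
        sample_target_meas(states[origin[j]], rng) if j in origin
        else sample_false_alarm(rng)
        for j in range(len(positions))
    ]
    return measurements, assignment
```

The returned `assignment` plays the role of $\alpha$: `measurements[assignment[i]]` is the measurement that originates from target i. No merged or split measurements can occur in this sketch, in line with the assumption stated above.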

2.3  INFORMATION  STRUCTURE 

The information structure is the additional component which defines
a distributed multitarget tracking and classification problem. Let N be
the finite set of tracking agents (nodes). Let $S_n$ be the set of sensors
reporting to node n. We make the following assumptions on the $S_n$'s:

a. $S = \bigcup_{n \in N} S_n$,

b. $S_n \cap S_{n'} = \emptyset$ for $n \neq n'$. (2.9)

These  assumptions  state  that  the  sensor  sets  for  the  various  tracking 
nodes  are  mutually  disjoint  but  collectively  exhaust  all  the  sensors. 
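
As a small illustration, the two assumptions in (2.9) can be checked mechanically. The function name `is_sensor_partition` and the dictionary layout (node label mapped to its sensor set) are assumptions made for this sketch only.

```python
def is_sensor_partition(all_sensors, sensors_by_node):
    """Check (2.9): the S_n are pairwise disjoint and their union is S."""
    seen = set()
    for sensors in sensors_by_node.values():
        sensors = set(sensors)
        if seen & sensors:          # violates assumption b (disjointness)
            return False
        seen |= sensors
    return seen == set(all_sensors)  # assumption a (exhaustiveness)
```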

The tracking nodes communicate to one another according to the
communication schedule C. $(t, n_1, n_2) \in C$ means that node $n_1$
communicates to node $n_2$ at time t.

Let T be the time interval of interest and (z,t,s) = (z(k),k) be
the data set from sensor s at t. Let Z be the set of all data sets and
K be the set of all data set indices (t,s) = k. At any time t, the
maximum information available to the entire system is given by the sets

$$Z|_t = \{(z,t',s) \in Z \mid t' \le t\} \qquad (2.10)$$

and

$$K|_t = \{(t',s) \in K \mid t' \le t\}. \qquad (2.11)$$

Because of communication constraints, the actual information available
to a node at time t is less. Consider the set of events when the
information in the system changes (either through transmission or
reception). The times when these events occur and the nodes (sensors or
tracking nodes) which are affected are given below:

- sensor observation: K

- reception of sensor data by a tracker:

  $\{(t,n) \in T \times N \mid (t,s) \in K,\ s \in S_n\}$

- transmission by a tracker:

  $\{(t,n) \in T \times N \mid (t,n,n') \in C\}$

- reception by a tracker:

  $\{(t,n) \in T \times N \mid (t,n',n) \in C\}$

To avoid unnecessary complexity, we assume (without loss of
generality) that the four sets defined above are disjoint. Let I be the
union of the four sets. A binary relation or a partial order $\prec$ can be
defined on the set I as follows:

i. For $(n,t,t') \in N \times T \times T$, $(t,n) \in I$, $(t',n) \in I$
and $t < t'$ implies that
$(t,n) \prec (t',n)$;

ii. $(t,s) \in K$, $s \in S_n$ and $(t,n) \in I$ implies that
$(t,s) \prec (t,n)$;

iii. $(t,n,n') \in C$ implies that
$(t,n) \prec (t,n')$.

This binary relation or partial order on I thus satisfies all the
constraints associated with perfect communication as defined by C as
well as perfect memory at each processing node. $(I, \prec)$ characterizes
the information flow in the system and is called the information graph. If
all the sensor measurements (data sets) can be communicated perfectly
through the communication network, a subset Z(t,i) of Z (called the data
base at (t,i)) for each node (t,i) in the graph $(I, \prec)$ can be defined
by beginning with the minimal elements and following the rules shown below:

i. If (t,i) is a receiving node,

$$Z(t,i) = \bigcup \{Z(s,j) \mid (s,j) \to (t,i)\},$$

ii. If (t,i) is a transmitting node,

$$Z(t,i) = \begin{cases} Z(s,j) & \text{if } (s,j) \to (t,i), \\ \{(z(k),k)\} & \text{if } (t,i) = k \in K, \\ \emptyset & \text{otherwise.} \end{cases}$$

In the above, $(s,j) \to (t,i)$ means that (s,j) is an immediate
predecessor of (t,i) and $(z(k),k) \in Z$ is the unique element whose second
component is k.

With this construction of the data base, we see that $(t,i) \prec (s,j)$
if and only if $Z(t,i) \subset Z(s,j)$. Similarly, for each $(t,i) \in I$, we
can define the data index base K(t,i), which corresponds to the indices for
the data sets in Z(t,i).

Since there is a natural direction (along increasing time) in the
graph, the arrowheads on the edges in a pictorial representation of the
graph can be omitted. We also omit those edges which are due to
transitivity. The graph makes the flow of information in the system
explicit. A node (t,i) is a parent of (s,j) if information
flows from node i at time t to node j at time s.

Note that in the information graph, the receiving nodes correspond
to the events when estimates have to be updated with the arrival of new
information. For many applications, it is sufficient to use a reduced
information graph, which is obtained by considering only these receiving
nodes.

Thus  the  maximum  information  available  to  each  information  node 
(t,i)  in  the  information  graph  is  given  by  the  data  base  Z(t,i)  or 
alternatively  by  the  data  index  base  K(t,i).  We  are  particularly 
interested  in  information  nodes  which  correspond  to  the  tracking  nodes. 
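
The propagation of data index bases through the information graph can be sketched as follows. This is an illustration under stated assumptions, not the construction used in the report: sensor data are delivered to the owning tracker at the observation time, all messages sent at a given time carry the sender's data index base as it stood before any message of that time was received, and the name `data_index_bases` and its argument layout are hypothetical.

```python
def data_index_bases(K, sensors_by_node, C):
    """Return {(t, n): K(t, n)} for every reception event at a tracker.

    K is a set of data indices (t, s); sensors_by_node maps n to S_n;
    C is a set of triples (t, n_from, n_to).
    """
    node_of = {s: n for n, ss in sensors_by_node.items() for s in ss}
    known = {n: set() for n in sensors_by_node}   # perfect memory per node
    bases = {}
    times = sorted({t for t, _ in K} | {t for t, _, _ in C})
    for t in times:
        # Reception of local sensor data at time t.
        for (tk, s) in K:
            if tk == t:
                n = node_of[s]
                known[n].add((tk, s))
                bases[(t, n)] = frozenset(known[n])
        # Messages at time t carry what each sender knew beforehand.
        snapshot = {n: set(v) for n, v in known.items()}
        for (tc, src, dst) in C:
            if tc == t:
                known[dst] |= snapshot[src]
                bases[(t, dst)] = frozenset(known[dst])
    return bases
```

With this sketch, set inclusion between two returned bases mirrors the ordering of the corresponding information nodes, which is the property noted above for Z(t,i).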


3. PROBLEM FORMULATION


Our objective is to consider a distributed version of the multitarget
tracking and classification problem as given in [5], [6]. In particular,
we would like to evaluate the probability of each data association
hypothesis using information communicated from other nodes. To this
end, we shall first define the notions of tracks and hypotheses in this
distributed framework.

3.1  TRACKS  AND  HYPOTHESES 

We define the measurement index set J by

$$J = \bigcup_{k \in K} \{1, \ldots, N_M(k)\} \times \{k\}. \qquad (3.1)$$

An element (j,t,s) in J (called a measurement index) denotes the
j-th measurement in the data set generated by sensor s at time t. Any
subset of J is called a track and any collection of nonempty tracks a
hypothesis. A track is called possible if it contains at most one
measurement index in each data set. A hypothesis is called possible if it
contains only possible nonempty tracks and no two tracks in it
intersect. Let $\mathcal{H}$ and $\mathcal{T}$ be the sets of all possible
hypotheses and tracks, respectively.

When $\bar{K}$ is a subset of K, define

$$J(\bar{K}) = \{(j,k) \in J \mid k \in \bar{K}\}. \qquad (3.2)$$

Then $J(\bar{K})$ is the measurement index set restricted by the data index
base $\bar{K}$. Similarly, if $\bar{J}$ is a subset of J, define

$$\mathcal{T}(\bar{J}) = \{\tau \cap \bar{J} \mid \tau \in \mathcal{T}\} \qquad (3.3)$$

and

$$\mathcal{H}(\bar{J}) = \{\{\tau \cap \bar{J} \mid \tau \in \lambda\} \setminus \{\emptyset\} \mid \lambda \in \mathcal{H}\}. \qquad (3.4)$$

Thus $\mathcal{T}(\bar{J})$ is the set of all possible tracks defined on
$\bar{J}$ and $\mathcal{H}(\bar{J})$ is the set of all possible hypotheses
defined on $\bar{J}$.
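
These definitions translate directly into set operations. The following is a minimal sketch, assuming a measurement index is represented as a pair `(j, k)` with `k = (t, s)`, a track as a `frozenset` of such pairs, and a hypothesis as a collection of tracks; the function names are hypothetical.

```python
def is_possible_track(track):
    """At most one measurement index from each data set k."""
    data_sets = [k for (_, k) in track]
    return len(data_sets) == len(set(data_sets))


def is_possible_hypothesis(hypothesis):
    """Only nonempty possible tracks, and no two tracks intersect."""
    tracks = list(hypothesis)
    if any(len(tr) == 0 or not is_possible_track(tr) for tr in tracks):
        return False
    used = [m for tr in tracks for m in tr]
    return len(used) == len(set(used))


def restrict_track(track, index_subset):
    """Intersection of a track with a measurement index subset, as in (3.3)."""
    return frozenset(track) & frozenset(index_subset)


def restrict_hypothesis(hypothesis, index_subset):
    """Restricted tracks with the empty track removed, as in (3.4)."""
    restricted = {restrict_track(tr, index_subset) for tr in hypothesis}
    restricted.discard(frozenset())
    return frozenset(restricted)
```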

3.2  DISTRIBUTED  FUSION  PROBLEM 

Consider  the  reduced  information  graph  which  is  constructed  from 
the  information  graph  by  picking  out  the  tracking  nodes.  Each  node  in 
the  graph  is  of  the  form  (t,n)  where  t  is  a  reception  time  (from  a  local 
sensor  or  other  nodes)  and  n  is  a  tracking  node.  Let 

J(t,n)  =  J(K(t,n)),  i.e.,  the  measurement  index  set  available  to  agent  n 
at  time  t.  Assume  that  the  information  state  for  multitarget  tracking 
and  classification  is  given  at  each  node,  i.e.,  for  each  node  (t,n),  we 
have  the  following  set  of  quantities: 

$$\mathcal{H}(J(t,n)),\ (p(\lambda \mid Z(t,n)))_{\lambda \in \mathcal{H}(J(t,n))},$$

$$\mathcal{T}(J(t,n)),\ (p(x(t) \mid Z(t,n),\tau))_{\tau \in \mathcal{T}(J(t,n))},\ \nu(K(t,n)) \qquad (3.5)$$

where $\nu(K)$ is the expected number of targets which are
undetected in K and $p(x(t) \mid Z(t,n),\tau)$ is the probability
distribution/density of x(t) given the track $\tau$ and the data base Z(t,n).
To simplify the notation, we denote the above by the information node
(t,n), i.e.,

$$\mathcal{H}(t,n),\ (p(\lambda \mid Z(t,n)))_{\lambda \in \mathcal{H}(t,n)},$$

$$\mathcal{T}(t,n),\ (p(x(t) \mid Z(t,n),\tau))_{\tau \in \mathcal{T}(t,n)},\ \nu(t,n) \qquad (3.6)$$

At each information node (t,n), the information state is to be
updated. For a node corresponding to reception of sensor data at a
tracking agent, the problem is straightforward and is the centralized
multitarget tracking problem. For a node corresponding to reception of
messages from other tracking agents, the problem is one of distributed
fusion, i.e., to construct the information state using the information
states from the predecessor nodes in the information graph. This problem
can be interpreted as follows. Suppose the information states for
multitarget tracking and classification (hypotheses, tracks,
probabilities, etc.) are the messages communicated in the network. Each
agent then tries to reconstruct the results that the optimal tracker would
produce if the actual measurements were communicated through the network,
using only the information states which can be communicated.


4.  STATIONARY  TARGETS  WITH  BROADCAST  COMMUNICATION 


In  this  section,  we  consider  a  special  case  to  develop  the  basic 
results.  These  results  would  then  serve  as  a  basis  for  studying  more 
complex  situations.  The  target  state  is  assumed  to  be  stationary,  i.e., 
$x_i(t) = x_i$ for all t. The communication is assumed to be the broadcast
type, i.e.,

$$C = \bigcup_{i=1}^{\infty} \bigcup_{n_1 \neq n_2} \{(t_i, n_1, n_2)\}. \qquad (4.1)$$

With  this,  the  information  graph  and  reduced  information  graph  are  given 
in  Figure  1. 

Consider two consecutive communication times $t_b$ and $t'_b$.

Let

$$J = J(K|_{t_b}), \qquad (4.2)$$

and

$$\bar{J} = J(K|_{t'_b}), \qquad (4.3)$$

i.e., the cumulative measurement indices at $t_b$ and $t'_b$ respectively.
Also, define for each n in N,

$$J_n = \{(m,t,s) \in J \mid s \in S_n\} \cup \bar{J}. \qquad (4.4)$$

Then $J_n$ is the cumulative measurement index set available to each
tracking agent n just before communication. $K_n$ can be defined similarly.

Let K and $\bar{K}$ now denote the data index bases $K|_{t_b}$ and
$K|_{t'_b}$. Let $Z_n$, $\bar{Z}$ and Z be the cumulative measurements
corresponding to $J_n$, $\bar{J}$ and J respectively.

We  have  to  answer  the  following  basic  questions. 

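Q1. Given $\mathcal{T}(\bar{J})$, $\mathcal{H}(\bar{J})$, $(\mathcal{T}(J_n))_{n \in N}$
and $(\mathcal{H}(J_n))_{n \in N}$,
can we construct $\mathcal{T}(J)$ and $\mathcal{H}(J)$?

Q2. Given $(p(\bar{\lambda} \mid \bar{Z}))_{\bar{\lambda} \in \mathcal{H}(\bar{J})}$,
$((p(\lambda_n \mid Z_n))_{\lambda_n \in \mathcal{H}(J_n)})_{n \in N}$,
$(p(x(t_b) \mid \bar{Z},\tau))_{\tau \in \mathcal{T}(\bar{J})}$,
$((p(x(t_b) \mid Z_n,\tau))_{\tau \in \mathcal{T}(J_n)})_{n \in N}$,
$\nu(\bar{K})$ and $(\nu(K_n))_{n \in N}$, can we calculate
$(p(\lambda \mid Z))_{\lambda \in \mathcal{H}(J)}$,
$(p(x(t_b) \mid Z,\tau))_{\tau \in \mathcal{T}(J)}$ and $\nu(K)$?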

These  questions  will  be  considered  separately. 

4.1  HYPOTHESIS  RECONSTRUCTION 

We  first  address  Question  1,  which  focusses  on  the  construction  of 
the  global  hypothesis  from  the  local  hypotheses. 

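Definition: Let $\tau \in \mathcal{T}(J)$ be a track and $\bar{J} \subset J$.
The restriction of $\tau$ onto $\bar{J}$ is defined as
$\tau|\bar{J} = \tau \cap \bar{J}$. $\bar{\tau} = \tau|\bar{J}$ is then a
predecessor of $\tau$ and $\tau$ is a successor of $\bar{\tau}$. Similarly,
when $\lambda \in \mathcal{H}(J)$ and $\bar{J} \subset J$, the
predecessor $\bar{\lambda} = \lambda|\bar{J}$ is defined as

$$\lambda|\bar{J} = \{\tau|\bar{J} \mid \tau \in \lambda\} \setminus \{\emptyset\}. \qquad (4.5)$$

Then $\lambda$ is a successor of $\bar{\lambda}$.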

The  following  two  lemmas  are  then  obvious. 

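Lemma 1: Let $J_1$ and $J_2$ be two measurement index sets such that
$J_1 \subset J_2$ and $\tau_1$ be a track in a hypothesis
$\lambda_1 \in \mathcal{H}(J_1)$. Then for any successor $\lambda_2$
of $\lambda_1$, i.e., for any $\lambda_2 \in \mathcal{H}(J_2)$ such that
$\lambda_2|J_1 = \lambda_1$, there exists a $\tau_2$ in $\lambda_2$ such
that $\tau_2|J_1 = \tau_1$. For given $\lambda_2$, such a track $\tau_2$ is
unique.

Lemma 2: Let $J_1$ and $J_2$ be two measurement index sets such that
$J_1 \subset J_2$. Then for any $\lambda \in \mathcal{H}(J_2)$ we have

$$\mathrm{Prob}\{\lambda|J_1 \mid \lambda, J_2\} = 1 \qquad (4.6)$$

whenever $\mathrm{Prob}\{\lambda, J_2\} > 0$.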

Definition: A hypothesis $\lambda$ in $\mathcal{H}(J)$ is said to be
composed of $(\lambda_n)_{n \in N} \in \prod_{n \in N} \mathcal{H}(J_n)$
if $\lambda|J_n = \lambda_n$ for all $n \in N$. The relationship is
denoted by

$$\lambda \rhd (\lambda_n)_{n \in N}. \qquad (4.7)$$

An immediate property as a result of this definition is that, if $\lambda$
is composed of $(\lambda_n)_{n \in N} \in \prod_{n \in N} \mathcal{H}(J_n)$,
then all the $\lambda_n$'s should share the same predecessor. Namely, we have

Lemma 3: If $\lambda \rhd (\lambda_n)_{n \in N}$, then

$$\lambda_n|\bar{J} = \lambda|\bar{J} \qquad (4.8)$$

for all $n \in N$.

The proof is obvious from the fact that $\bar{J} \subset J_n$ for all
$n \in N$.
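
The composition relation and the shared-predecessor property of Lemma 3 can be checked on small examples. The sketch below assumes tracks are `frozenset`s of measurement indices and hypotheses are `frozenset`s of tracks; `restrict` and `is_composed_of` are hypothetical helper names.

```python
def restrict(hypothesis, index_subset):
    """Restriction of a hypothesis: intersect each track, drop empty ones."""
    index_subset = frozenset(index_subset)
    restricted = {frozenset(tr) & index_subset for tr in hypothesis}
    restricted.discard(frozenset())
    return frozenset(restricted)


def is_composed_of(hypothesis, local_hyps, local_index_sets):
    """True if the restriction to J_n equals lambda_n for every node n."""
    return all(
        restrict(hypothesis, local_index_sets[n]) == frozenset(local_hyps[n])
        for n in local_hyps
    )
```

If `is_composed_of` holds, restricting the global hypothesis and each local hypothesis to the common index set gives the same result, which is the statement of Lemma 3.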


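A useful criterion for testing any $(\lambda, (\lambda_n)_{n \in N})$ for
composability is given by the following lemma.

Lemma 4: $\lambda \rhd (\lambda_n)_{n \in N}$ if and only if, for any
$\tau$ in $\lambda$ there exists a tuple

$$(\tau_n)_{n \in N} \in \prod_{n \in N} (\lambda_n \cup \{\emptyset\}) \qquad (4.9)$$

such that $\tau = \bigcup_{n \in N} \tau_n$ and
$\tau_n|\bar{J} = \tau|\bar{J}$ for all $n \in N$.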

Proof of Lemma 4: The "if" part is obvious. For the "only if" part,
suppose $\lambda \rhd (\lambda_n)_{n \in N}$ and $\tau \in \lambda$. For each
$n \in N$, let $\tau_n = \tau|J_n$. Then, since $\lambda_n = \lambda|J_n$, we
have either $\tau_n \in \lambda_n$ or $\tau_n = \emptyset$. Since
$J = \bigcup_{n \in N} J_n$,

$$\bigcup_{n \in N} \tau_n = \bigcup_{n \in N} \tau|J_n = \bigcup_{n \in N} (\tau \cap J_n) = \tau \cap J = \tau.$$

On the other hand, $\bar{J} \subset J_n$ implies that
$\tau_n|\bar{J} = (\tau|J_n)|\bar{J} = \tau \cap J_n \cap \bar{J} = \tau \cap \bar{J} = \tau|\bar{J}$.
Thus, $(\tau_n)_{n \in N}$ is an appropriate element which satisfies the
lemma. ■

The following theorem forms the basis for hypothesis reconstruction.


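Theorem 1: For any $\lambda \in \mathcal{H}(J)$ and any
$(\lambda_n)_{n \in N} \in \prod_{n \in N} \mathcal{H}(J_n)$,
$\lambda \rhd (\lambda_n)_{n \in N}$ if and only if

(1) for any $\tau$ in $\lambda$ there exists a tuple
$(\tau_n)_{n \in N} \in \prod_{n \in N} (\lambda_n \cup \{\emptyset\})$
such that

a. $\tau = \bigcup_{n \in N} \tau_n$ and

b. $\tau_n|\bar{J} = \tau|\bar{J}$ for all $n \in N$, and

(2) for any $n \in N$ and for any $\tau_n \in \lambda_n$ there exists a
unique $\tau$ in $\lambda$ such that $\tau|J_n = \tau_n$.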


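Proof:

The "only if" part is obvious from the definition of
$\lambda \rhd (\lambda_n)_{n \in N}$.

The "if" part is as follows. Suppose (1) and (2) hold. (2) is
equivalent to $\lambda_n \subset \lambda|J_n$ for all $n \in N$. On the other
hand, (1) implies that $\lambda_n \supset \lambda|J_n$, as can be shown below.
Let n be an arbitrary index in N and $\tilde{\tau}_n$ be an arbitrary track
in $\lambda|J_n$. Then, by Lemma 1, there exists a unique extension $\tau$ in
$\lambda$ such that $\tilde{\tau}_n \subset \tau$ and
$\tau|J_n = \tilde{\tau}_n$. For this $\tau$, by (1) there exists a tuple
$(\tau_{n'})_{n' \in N} \in \prod_{n' \in N} (\lambda_{n'} \cup \{\emptyset\})$
such that $\bigcup_{n' \in N} \tau_{n'} = \tau$ and
$\tau_{n'}|\bar{J} = \tau|\bar{J}$ for each $n' \in N$.

Then we have

$$\begin{aligned}
\tau|J_n &= \Big(\bigcup_{n' \in N} \tau_{n'}\Big)\Big|J_n
          = \Big(\bigcup_{n' \in N} \tau_{n'}\Big) \cap J_n
          = \bigcup_{n' \in N} (\tau_{n'} \cap J_n) \\
         &= \tau_n \cup \Big(\bigcup_{n' \in N,\ n' \neq n} \tau_{n'} \cap \bar{J}\Big)
          = \tau_n \cup (\tau|\bar{J}) = \tau_n.
\end{aligned} \qquad (4.11)$$

Therefore, $\tilde{\tau}_n = \tau_n$. But, since $\tilde{\tau}_n \neq \emptyset$
(and hence $\tau_n \neq \emptyset$), $\tilde{\tau}_n = \tau_n \in \lambda_n$.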

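This theorem provides the following way of constructing $\mathcal{H}(J)$ and
$\mathcal{T}(J)$.

1. For each $\bar{\lambda} \in \mathcal{H}(\bar{J})$, exhaust all the
combinations of $(\lambda_n)_{n \in N} \in \prod_{n \in N} \mathcal{H}(J_n)$
such that $\lambda_n|\bar{J} = \bar{\lambda}$.

2. For each such $(\lambda_n)_{n \in N} \in \prod_{n \in N} \mathcal{H}(J_n)$,

a. construct a unique extended track $\tau$ such that
$\tau \cap \bar{J} = \bar{\tau}$ by letting $\tau = \bigcup_{n \in N} \tau_n$,
where $\tau_n$ is the unique extension in $\lambda_n$ of $\bar{\tau}$, for
each $\bar{\tau} \in \bar{\lambda}$, and let $\lambda_{OLD}$ be the set of
such tracks, and

b. exhaust all the combinations
$(\tau_n)_{n \in N} \in \prod_{n \in N} ((\lambda_n \cup \{\emptyset\}) \setminus \lambda_n^{OLD})$
(where $\lambda_n^{OLD} = \{\tau \cap J_n \mid \tau \in \lambda_{OLD}\}$)
and construct new tracks $\tau = \bigcup_{n \in N} \tau_n$. This should be
done in such a way that every $\tau_n$ in every
$\lambda_n \setminus \lambda_n^{OLD}$ is included in one of the tracks in the
composite hypothesis. Let $L_{NEW}$ be the set of all the sets of new tracks
constructed in this way. Then the set of all the hypotheses $\lambda$ in
$\mathcal{H}(J)$ such that $\lambda \rhd (\lambda_n)_{n \in N}$ is

$$\{\lambda_{OLD} \cup \lambda_{NEW} \mid \lambda_{NEW} \in L_{NEW}\}. \qquad (4.12)$$

The construction of $\mathcal{T}(J)$ is obvious in the above description.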


4.2  HYPOTHESIS  EVALUATION 

We now address Question 2, which focusses on the construction of
the global probabilities of hypotheses and state distributions of tracks
from the local values. We state the following lemma, derived in [5],
[6], for the recursive evaluation of hypotheses.

Lemma 5: Consider an information node with cumulative measurements
$\bar{Z}$ and cumulative measurement index set $\bar{J}$. Consider an
immediate successor with cumulative measurements Z and cumulative
measurement index set J. Let k = (t,s) be the most current data index. Then
the recursive evaluation of any $\lambda$ in $\mathcal{H}(J)$ is given by

$$p(\lambda \mid Z) = C(Z)^{-1}\, p(\bar{\lambda} \mid \bar{Z})\, L_{FA}(k,\lambda) \prod_{\tau \in \lambda} L_k(Y(\tau,k),\tau) \qquad (4.13)$$

where $Y(\tau,k)$ is the measurement for track $\tau$ in the current data set
indexed by k, and $L_k(Y(\tau,k),\tau)$ are likelihood functions defined as
follows:

False Alarm Likelihood Function

$$L_{FA}(k,\lambda) = n_{FA}(\lambda|k)!\; p_{N_{FA}}(n_{FA}(\lambda|k)) \prod_{j \in J_{FA}(\lambda|k)} p_{FA}(y_j \mid k) \qquad (4.14)$$

where $n_{FA}(\lambda|k)$ is the number and $J_{FA}(\lambda|k)$ the set of
false alarms in the current data set according to $\lambda$.
Track-Measurement  Likelihood  Functions 
In equations (4.15) and (4.17), y is the measurement associated
with track $\tau$, $p_{\bar{\tau}}(\cdot) = p(x(t) \mid \bar{Z},\bar{\tau})$ is
the state distribution/density for the track restricted by $\bar{Z}$ and
$q(\cdot) = p_{\emptyset} = p(x(t) \mid \bar{Z},\emptyset)$ is the density of
undetected targets associated with $\bar{Z}$.


A  recursive  application  of  Lemma  5  yields  the  following. 


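Lemma 6: Let $\bar{Z}$ be any data base, $\bar{J}$ the corresponding
measurement index set and $\bar{K}$ be the data index base. Then for any Z
such that $\bar{Z} \subset Z$ (with the corresponding J and K),

(1) for each $\lambda \in \mathcal{H}(J)$,

$$p(\lambda \mid Z) = C(Z)^{-1}\, p(\bar{\lambda} \mid \bar{Z})\, L_{FA} \prod_{\tau \in \lambda} L_{\tau} \qquad (4.18)$$

where C is a normalization constant,

$$\bar{\lambda} = \lambda|\bar{J}, \qquad (4.19)$$

$$L_{FA} = \prod_{(t,s) \in K \setminus \bar{K}} L_{FA}(t,s,\lambda) \qquad (4.20)$$

and

$$L_{\tau} = \begin{cases}
\prod_{(t,s) \in K \setminus \bar{K}} L_{(t,s)}(Y(\tau,t,s),\tau) & \text{if } \tau|\bar{J} \neq \emptyset, \\
\nu(\bar{K}) \prod_{(t,s) \in K \setminus \bar{K}} L_{(t,s)}(Y(\tau,t,s),\tau) & \text{otherwise,}
\end{cases} \qquad (4.21)$$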




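(2) for any $\tau \in \mathcal{T}(J) \cup \{\emptyset\}$,

$$p(\cdot \mid Z,\tau) = \begin{cases}
\prod_{(t,s) \in K \setminus \bar{K}} G_{Y(\tau,t,s)}(t,s)\, F(t,s)\, p_{\bar{\tau}} & \text{if } \bar{\tau} = \tau|\bar{J} \neq \emptyset, \\
\prod_{(t,s) \in K \setminus \bar{K}} G_{Y(\tau,t,s)}(t,s)\, F(t,s)\, q & \text{otherwise,}
\end{cases} \qquad (4.22)$$

with $q = p_{\emptyset} = p(\cdot \mid \bar{Z},\emptyset)$, and $G_y(t,s)$ and
F(t,s) are operators defined as

$$(G_y(t,s)(p))(x) = \frac{g_y(t,s)(x)\, p(x)}{\int g_y(t,s)(x)\, p(x)\, \mu(dx)}, \qquad (4.23)$$

$$(F(t,s)(p))(x) = \int f_{\Delta t}(x \mid x')\, p(x')\, \mu(dx') \qquad (4.24)$$

and

$$g_y(t,s)(x) = \begin{cases}
p_M(y \mid x,t,s)\, p_D(x \mid t,s) & \text{for detected targets,} \\
1 - p_D(x \mid t,s) & \text{for missed targets.}
\end{cases} \qquad (4.25)$$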
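
The two operators have a simple form on a finite state space. The following is a minimal sketch, assuming the state space is {0, ..., n-1} with counting measure in place of $\mu$, distributions stored as lists, and `transition[x][x_prev]` holding the transition probability of moving to x from x_prev; the function names are hypothetical.

```python
def predict(p, transition):
    """F operator (4.24) on a finite state space: sum over x' of f(x | x') p(x')."""
    n = len(p)
    return [sum(transition[x][xp] * p[xp] for xp in range(n)) for x in range(n)]


def update(p, g):
    """G operator (4.23): pointwise product with g_y followed by normalization."""
    weighted = [gx * px for gx, px in zip(g, p)]
    total = sum(weighted)
    return [w / total for w in weighted]


def g_detected(meas_likelihood, p_detect):
    """g_y for a detected target, as in (4.25): p_M(y | x) p_D(x) for each x."""
    return [m * d for m, d in zip(meas_likelihood, p_detect)]


def g_missed(p_detect):
    """g_y for a missed target, as in (4.25): 1 - p_D(x) for each x."""
    return [1.0 - d for d in p_detect]
```

For one data set, the updated track distribution in (4.22) corresponds to `update(predict(p, transition), g)`, with `g` chosen according to whether the track has a measurement in that data set.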


Using  Lemma  6,  we  can  prove  the  following  theorem. 

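Theorem 2: For stationary targets and broadcast communication, we have
for every $\lambda \in \mathcal{H}(J)$,

$$p(\lambda \mid Z) = C^{-1}\, p(\bar{\lambda} \mid \bar{Z})^{-(\#N-1)} \Big(\prod_{n \in N} p(\lambda_n \mid Z_n)\Big) \prod_{\tau \in \lambda} \ell_{\tau} \qquad (4.26)$$

where C is a normalization constant, #N is the number of elements in N,
and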




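The expected number of targets undetected up to K is:

$$\nu(K) = \int \frac{\prod_{n \in N} \tilde{p}(x \mid Z_n,\emptyset)}{(\tilde{p}(x \mid \bar{Z},\emptyset))^{\#N-1}}\, \mu(dx). \qquad (4.28)$$

$\tilde{p}(x \mid Z_n,\tau)$ and $\tilde{p}(x \mid \bar{Z},\tau)$ are given by

$$\tilde{p}(x \mid Z_n,\tau) = \begin{cases}
p(x \mid Z_n,\tau) & \text{if } \tau \neq \emptyset, \\
\nu(K_n)\, p(x \mid Z_n,\emptyset) & \text{if } \tau = \emptyset,
\end{cases} \qquad (4.29)$$

$$\tilde{p}(x \mid \bar{Z},\tau) = \begin{cases}
p(x \mid \bar{Z},\tau) & \text{if } \tau \neq \emptyset, \\
\nu(\bar{K})\, p(x \mid \bar{Z},\emptyset) & \text{if } \tau = \emptyset.
\end{cases} \qquad (4.30)$$

$p(x \mid Z_n,\tau)$ and $p(x \mid \bar{Z},\tau)$ are the state distributions
at the time of fusion conditioned by track $\tau$, $Z_n$ and $\bar{Z}$.
Furthermore, the state distributions can be fused to obtain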


5.  DYNAMIC  TARGET  MODELS  WITH  BROADCAST  COMMUNICATION 


Suppose the data index base after the current broadcast is K and
the data index base after the previous broadcast is $\bar{K}$.

Let

$$T_I = \{t \mid (t,s) \in K \setminus \bar{K}\} \qquad (5.1)$$

and

$$x_I = (x(t))_{t \in T_I}. \qquad (5.2)$$

Then the following theorem, which can be proved readily, holds for dynamic
target models represented by a Markov process.

Theorem 3: For dynamic target models with broadcast communication, we
have for every $\lambda \in \mathcal{H}(J)$

$$p(\lambda \mid Z) = C^{-1}\, p(\bar{\lambda} \mid \bar{Z})^{-(\#N-1)} \Big(\prod_{n \in N} p(\lambda_n \mid Z_n)\Big) \prod_{\tau \in \lambda} \ell_{\tau} \qquad (5.3)$$

where C is a normalization constant and $\ell_{\tau}$ is the same as that in
Theorem 2 with x replaced by $x_I$.

Note that this theorem states that the likelihood of track associations
is now computed using the entire state trajectory over the interval
defined by $T_I$ instead of at just one time.
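
As a worked special case, consider exactly two tracking nodes, N = {1, 2}, so that #N - 1 = 1. This is only a sketch that takes the evaluation formula (5.3) as stated; the symbols $\bar{\lambda}$ (the common predecessor of the local hypotheses on the data shared after the previous broadcast) and $\ell_\tau$ (the track-dependent factor of Theorem 2) follow the notation assumed here for that formula. The formula then reads

```latex
p(\lambda \mid Z) = C^{-1}\,
  \frac{p(\lambda_1 \mid Z_1)\, p(\lambda_2 \mid Z_2)}
       {p(\bar{\lambda} \mid \bar{Z})}
  \prod_{\tau \in \lambda} \ell_\tau
```

The product of the two local hypothesis probabilities counts the information common to both nodes twice, and the division by $p(\bar{\lambda} \mid \bar{Z})$ removes one copy of it.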


The  following  two  corollaries  are  easy  to  show. 

Corollary 1: Suppose $T_I$ contains only one element, such as when
broadcast communication is carried out at every time instant. Then the
evaluation formula of Theorem 3 holds with $x_I$ reduced to x(t) if the
most recent communication time is t.

Corollary 2: Suppose $x_i(t)$ is a deterministic process, i.e., there
exists a group of homeomorphic operators on X, $(\phi_t)_t$, such that

$$f_{\Delta t}(x \mid x') = \delta(x - \phi_{\Delta t}(x')) \qquad (5.4)$$

where $\delta(\cdot)$ is the delta function on $(X,\mu)$. Then Theorem 3
holds with $x_I$ replaced by the state x(t) at a single time t.

In the two special cases mentioned above, only the state distribution
of the target at a single time is needed in evaluating the track
association likelihoods. Otherwise, one would have to compute the
distribution of the target state over an interval.


6.  CONCLUSIONS 


We have investigated the distributed multitarget tracking and
classification problem. The approach is based on a Bayesian theory for
centralized multitarget tracking and classification. Specific results are
given for the case when the communication is of the broadcast type, and
algorithms for hypothesis formation and evaluation are presented for
independent and identically distributed target models. The targets
can be either static or dynamic random processes.